From jason.stajich at gmail.com  Fri Feb  1 01:58:57 2013
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 31 Jan 2013 22:58:57 -0800
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
References: <mailman.7.1359565204.26693.bioperl-l@lists.open-bio.org>
	<575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
Message-ID: <CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>

Dan - 

I think the answer is yes, if others are doing it - I am not in a position to be much of a main coder.

I don't know which format you mean here, or whether you had to write something for the text BLAST changes or something else.  Specific bug reports on formats that aren't working are always helpful.  The XML format has been pretty stable, so I would suggest using that if you are simply parsing reports rather than reading them.

Chris posted instructions on how to contribute, and the move to GitHub simplifies this.  Having to write a whole new parser seems a bit severe - I hope that in the future people can raise problems sooner. If I hit a wall with something I can't do, I usually write the code to fix it and contribute it back, but I don't play follow-the-format-changes with the tools anymore. Hopefully others like yourself can make those contributions.

If you are referring to the response I made to the question below, I don't think anyone will try to support NCBI's additional markup for upstream and downstream features, as laid out in the text reports, without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable.
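
For anyone who needs those flanking lines in the meantime, a standalone pre-parser might look roughly like the sketch below. This is plain Perl and not part of Bio::SearchIO; the function name parse_flanking_features and the hash layout are made up for illustration, and the line patterns are assumed from the report excerpt quoted further down rather than from any format specification.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): collect the
# "Features flanking ..." lines that follow a hit header in a
# plain-text BLAST report, keyed by the first token of the hit line.
sub parse_flanking_features {
    my ($text) = @_;
    my (%flanks, $hit);
    my $in_block = 0;
    for my $line (split /\n/, $text) {
        if ($line =~ /^>\s*(\S+)/) {
            # New hit, e.g. ">gb|CP002686.1| Arabidopsis thaliana ..."
            $hit      = $1;
            $in_block = 0;
        }
        elsif ($line =~ /^\s*Features flanking this part of subject sequence:/) {
            $in_block = 1;
        }
        elsif ($in_block && defined $hit
            && $line =~ /^\s*(\d+) bp at ([35])' side: (.+?)\s*$/) {
            push @{ $flanks{$hit} },
                { distance => $1, side => $2, name => $3 };
        }
        else {
            # Any other line ends the current flanking block.
            $in_block = 0;
        }
    }
    return \%flanks;
}

# Sample input assembled from the excerpt quoted in this thread.
my $report = <<'END';
>gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence
Length=23459830

 Features flanking this part of subject sequence:
   8872 bp at 5' side: cytochrome P450 90B1
   402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K
END

my $flanks = parse_flanking_features($report);
for my $id (sort keys %$flanks) {
    for my $f (@{ $flanks->{$id} }) {
        printf "%s\t%s'\t%d bp\t%s\n",
            $id, $f->{side}, $f->{distance}, $f->{name};
    }
}
```

The resulting hash could then be consulted by hit name while walking the same report with SearchIO, which is close to the two-pass approach Jochen describes. It will break whenever NCBI changes the wording of those lines, which is exactly the maintenance problem discussed above.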

best wishes,
Jason
On Jan 30, 2013, at 1:40 PM, Dan kilburn <dr_kilburn59 at yahoo.com> wrote:

> Hi Jason,
> 
> Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help.
> 
> --Dan
> Sent from my iPhone
> 
> On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote:
> 
>> Today's Topics:
>> 
>>  1. Re:  Parsing Blast-Report extracting "Features flanking    .."
>>     (Jason Stajich)
>> 
>> 
>> ----------------------------------------------------------------------
>> 
>> Message: 1
>> Date: Tue, 29 Jan 2013 11:00:16 -0800
>> From: Jason Stajich <jason.stajich at gmail.com>
>> Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features
>>   flanking    .."
>> To: buschj at hhu.de
>> Cc: bioperl-l at lists.open-bio.org
>> Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com>
>> Content-Type: text/plain;    charset=us-ascii
>> 
>> We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that.
>> 
>> However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features.
>> 
>> basically:
>> - download the genome and GFF for arabidopsis
>> - align your sRNA to the genome with a short read aligner - bowtie, bwa, others
>> - convert your sam to bam file with SAMtools or picard
>> - compare the location of features with the reads to get expression summaries or individual reads with BEDTools
>> 
>> 
>> On Jan 25, 2013, at 2:20 AM, jobu <buschj at hhu.de> wrote:
>> 
>>> Am 22.01.2013 19:03, schrieb Mgavi Brathwaite:
>>>> What upstream and downstream elements are you interested in?
>>> 
>>> 
>>> I've got a huge pile of short RNA reads.
>>> Part of the question now is whether those RNA fragments originate from
>>> siRNA events,
>>> or may represent miRNAs / parts of pre-miRNAs.
>>> 
>>> So I did an online  blast search against database nt.
>>> The resulting report quite often just gives subject information like this:
>>> 
>>> -----
>>>> gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence
>>> Length=23459830
>>> -----
>>> 
>>> Now I would like to get the hit's neighbouring regions  for further
>>> analysis.
>>> Preferably I would like to do that in an automated way, but the only
>>> possible action with this kind of subject gi | description would be to
>>> fetch the entire chromosomal  sequence I guess ?
>>> 
>>> However,
>>> right below the line above, the report states more precisely:
>>> 
>>> ------
>>> Features flanking this part of subject sequence:
>>> 8872 bp at 5' side: cytochrome P450 90B1
>>> 402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K
>>> ------
>>> 
>>> Still I would like to have the possibility to automatically fetch the
>>> subject's sequence(s),
>>> as of now I think parsing the report with SearchIO won't let me acquire
>>> that information, because SearchIO does not recognize report sections
>>> like those.
>>> 
>>> I hope I did not miss any of SearchIOs capabilities, but I could not
>>> find any method covering my wish?!
>>> 
>>> Right now maybe the only way to get the information I want is to
>>> construct my own parser and write it out into a separate file, which in
>>> turn again  I could read into a hash before processing the Blast-Report
>>> with SearchIO to combine both data for further automated work.
>>> 
>>> I am aware though that even successfully getting the flanking features
>>> would leave me with the more or less wide  intergenic gap my hsp is
>>> located in.
>>> 
>>> However I'm in need of a way to get the flanking features including
>>> their annotation and the region spanning between them.
>>> But I hope I do not have to get complete sequences to accomplish that,
>>> as this would be kind of an overkill.
>>> 
>>> with kind regards
>>> Jochen
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>> 
>> 
>> 
>> 

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From dr_kilburn59 at yahoo.com  Fri Feb  1 09:25:34 2013
From: dr_kilburn59 at yahoo.com (Dan Kilburn)
Date: Fri, 1 Feb 2013 06:25:34 -0800 (PST)
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>
References: <mailman.7.1359565204.26693.bioperl-l@lists.open-bio.org>
	<575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
	<CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>
Message-ID: <1359728734.27412.YahooMailNeo@web162006.mail.bf1.yahoo.com>

Hi Jason,

Thanks for the detailed feedback.  The real reason I had to write my own parser is that, even with close, repeated support from NCBI, we couldn't get XML output with short_web_blast.pl because the parameter that turns on XML output was not functioning (they've probably fixed it by now), and I had to crank out a parser asap to support a job talk.

I don't think the upstream and downstream feature reports are particularly useful, because in mammals they tend to be so far away that they are not likely to be biologically relevant.  But the internal motif reports are useful, maybe especially if you are blasting short reads, like I was.  A 16-mer preserved domain hit is really good if you're blasting 18-mer Illumina short reads.

As far as my involvement goes, I got diagnosed with cancer on Wednesday, so I'll be taking a step back until next week's surgery and taking a lot of deep breaths.  On the other hand, this just makes me more motivated: I've been thinking a lot about time, and timely contributions, the last two days.

Cheers,
Dan



From carandraug+dev at gmail.com  Sat Feb  2 20:44:31 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Sun, 3 Feb 2013 01:44:31 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue with
	output option
Message-ID: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>

Hi

the TCoffee module does not accept options of the named argument type:

-arg => option

Instead, one needs to write

'arg' => option

Is there a special reason for this? I tracked this down to the commit

7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e

12 years ago[1]. A comment on the code actually says "don't want named
parameters"[2] (though the commit message sounds pretty innocuous
"migrated to new Bio::Root::RootI chained new"). Is there a reason for
this? The rest of bioperl has no issue with named parameters, and the
API should be the same as Clustalw which also has no problem with it.
This is very easy to fix, I can submit a pull request no problem.

Also, shouldn't the code complain in the case of non-supported
options? It took me a very long time to find the problem because
there were no complaints coming from the code.

There is also a problem with the way it handles the output option.
I'll have to look closer into it, but the documentation is simply
incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
(undocumented), works fine.

Carnë
[1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
[2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374


From cjfields at illinois.edu  Sun Feb  3 16:54:51 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 3 Feb 2013 21:54:51 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue
 with	output option
In-Reply-To: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
References: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>

Carnë,

On Feb 2, 2013, at 7:44 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> the TCoffee module does not accept options of the named argument type:
> 
> -arg => option
> 
> one needs to do like
> 
> 'arg' => option
> 
> Is there a special reason for this? I tracked down this to the commit
> 
> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
> 
> 12 years ago[1]. A comment on the code actually says "don't want named
> parameters"[2] (though the commit message sounds pretty innocuous
> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
> this? The rest of bioperl has no issue with named parameters, and the
> API should be the same as Clustalw which also has no problem with it.
> This is very easy to fix, I can submit a pull request no problem.

IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones.  This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.  

The downside of big changes like this: potential backwards compatibility issues.  Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change.  I don't have a problem breaking this with a bioperl 2.0 release, though.  

> Also, shouldn't the code complain in the case of non-supported
> options? Took me a very long time to find out the problem because
> there was no complaints coming from the code.

Yes, it should complain when options are given that do not make sense, some validation would help there.  With some modules this might be a side-effect of using AUTOLOAD or simply not checking the parameters.

> There is also a problem with the way it handles the output option.
> I'll have to look closer into it, but the documentation is simply
> incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
> (undocumented), works fine.

That's entirely possible.

> Carnë
> [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
> [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374

As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it.  Infernal was this way IIRC.  Maybe these should just be simply stored as a semi-validated set of key-value pairs.  
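
A rough sketch of what I mean by a semi-validated set of key-value pairs follows. It is plain Perl, not the current TCoffee wrapper code; the option whitelist and the name set_program_params are hypothetical and only there to show the shape. It accepts both the '-arg' and 'arg' spellings and complains about anything it does not recognize, which would also cover the silent-failure problem above.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical whitelist; a real wrapper would list the options its
# wrapped program actually supports.
my %ALLOWED = map { $_ => 1 } qw(output matrix ktuple quiet);

# Accept both '-arg' and 'arg' spellings, store the values as plain
# key-value pairs, and croak on anything not on the whitelist.
sub set_program_params {
    my (%args) = @_;
    my %params;
    for my $key (sort keys %args) {
        (my $name = lc $key) =~ s/^-//;    # strip one leading dash
        croak "Unsupported option: $key" unless $ALLOWED{$name};
        $params{$name} = $args{$key};
    }
    return \%params;
}

my $params = set_program_params(-output => 'fasta', 'ktuple' => 2);
print "$_ => $params->{$_}\n" for sort keys %$params;
```

Because the names are only hash keys, options containing hyphens or starting with a digit need no special handling, and no AUTOLOAD is involved.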

chris

From carandraug+dev at gmail.com  Sun Feb  3 23:34:22 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Mon, 4 Feb 2013 04:34:22 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue
 with output option
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_2b2+Dy-HW3ngjNd2tjaTxgvFpTR-rKzq7HOO-6ZzyoTQ@mail.gmail.com>

On 3 February 2013 21:54, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 2, 2013, at 7:44 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> Hi
>>
>> the TCoffee module does not accept options of the named argument type:
>>
>> -arg => option
>>
>> one needs to do like
>>
>> 'arg' => option
>>
>> Is there a special reason for this? I tracked down this to the commit
>>
>> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
>>
>> 12 years ago[1]. A comment on the code actually says "don't want named
>> parameters"[2] (though the commit message sounds pretty innocuous
>> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
>> this? The rest of bioperl has no issue with named parameters, and the
>> API should be the same as Clustalw which also has no problem with it.
>> This is very easy to fix, I can submit a pull request no problem.
>
> IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones.  This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.
>
> The downside of big changes like this: potential backwards compatibility issues.  Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change.  I don't have a problem breaking this with a bioperl 2.0 release, though.

Should passing the tests be enough? There's one for TCoffee. At the
moment I don't see how this would cause compatibility issues: we are
adding an option, not removing one. But the comment in the code,
stating plainly that the -param API was not wanted, caught me by
surprise, which is why I'm asking.

> As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it.  Infernal was this way IIRC.  Maybe these should just be simply stored as a semi-validated set of key-value pairs.

From a quick glance at the list of TCoffee parameters, I don't at the
moment see any that should cause problems.

I have submitted a bug report[1] which mentions some other issues I
found with TCoffee. If someone could comment on them, that would be
great, and I can start fixing them.

Carnë

[1] https://redmine.open-bio.org/issues/3406


From whereverroadgoes at gmail.com  Mon Feb  4 10:39:19 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 07:39:19 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
Message-ID: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>

The result I get is:

Number of bases of type A = 
Number of bases of type C = 
Number of bases of type G = 
Number of bases of type T = 

i.e. the expected values are missing.
Please help!

#! /usr/bin/perl

use Bio::Tools::SeqStats;
use Bio::Seq;

open (FILE, "seq.fasta");
@array = <FILE>;

# Removing first line of fasta

shift (@array);
$array = join('',@array);
open (FILE2, ">>seq2.fasta");
print FILE2 "$array";

$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
- alphabet => 'dna',);


my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);

my $monomer_ref = $seq_stats->count_monomers();

foreach $base (sort keys %$monomer_ref) {
print "Number of bases of type ", $base," = ", $monomer_ref{$base},"\n";
}


From hamish.mcwilliam at bioinfo-user.org.uk  Mon Feb  4 11:59:16 2013
From: hamish.mcwilliam at bioinfo-user.org.uk (Hamish McWilliam)
Date: Mon, 4 Feb 2013 16:59:16 +0000
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
In-Reply-To: <loom.20130201T045704-740@post.gmane.org>
References: <200305311150.h4VBopn2019091@localhost.localdomain>
	<loom.20130201T045704-740@post.gmane.org>
Message-ID: <CABqDwwLHWp2fZm5h8KJmZhBFV6QmNLJrg5OE=hR+9U3Y3UJ7_g@mail.gmail.com>

BLASTCLUST is part of the legacy NCBI BLAST package (not NCBI BLAST+)
and can be obtained from:

ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST

As Robert notes, there are many other tools which can be used to
perform sequence clustering; Wikipedia has a Sequence Clustering
article (http://en.wikipedia.org/wiki/Sequence_clustering) which lists
some of the most commonly used.

All the best,

Hamish

On 1 February 2013 04:15, Rob <yuf228 at hotmail.com> wrote:
> Cyril C.C. Chua <bmbcccc <at> bmb.leeds.ac.uk> writes:
>
>>
>> Hi,
>>
>> I have some difficulty in sourcing for BLASTCLUST or related
>> programs/mods. Does any1 know exactly how to locate them?
>>
>> Regards
>>
>> Cyril Chua
>>
>
>
> Hi Cyril,
>
> I heard of the following programmes that might do similar things (I HAVEN'T
> used any of them yet):
>
> Afree - http://www.vicbioinformatics.com/software.afree.shtml
> Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf
> Usearch - http://www.drive5.com/usearch/
> DomClust - http://mbgd.genome.ad.jp/domclust/
>
> or
>
> Check this:
>
> http://ppod.princeton.edu/help/help_tech.html
>
> God bless,
>
>
> Robert
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


--
----
"Saying the internet has changed dramatically over the last five years
is cliché - the internet is always changing dramatically" - Craig
Labovitz, Arbor Networks.


From whereverroadgoes at gmail.com  Mon Feb  4 12:34:10 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 09:34:10 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
Message-ID: <b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>

Thanks Roy,

It still doesn't seem to produce anything. :/

From roy.chaudhuri at gmail.com  Mon Feb  4 12:51:03 2013
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Mon, 4 Feb 2013 17:51:03 +0000
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
Message-ID: <CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>

Sorry, I'd missed another problem in your code - you are trying to
load a fasta file using Bio::PrimarySeq. To read sequence data from a
file you should use Bio::SeqIO, see:

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_file
http://www.bioperl.org/wiki/HOWTO:SeqIO

Cheers,
Roy.

From asjo at koldfront.dk  Mon Feb  4 12:58:25 2013
From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Mon, 04 Feb 2013 18:58:25 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> (Slym's
	message of "Mon, 4 Feb 2013 07:39:19 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID: <8738xc2c72.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 07:39:19 -0800 (PST), Slym wrote:

> #! /usr/bin/perl

> use Bio::Tools::SeqStats;
> use Bio::Seq;

It can be a good idea to add "use strict; use warnings;" to the top of
your script. At least two problems in your program would have been
caught by perl if you had.

> open (FILE, "seq.fasta");

Using (global) literal filehandles and the two parameter open() is
somewhat outdated, a more current way to do it could be:

  open my $fh, '<', 'seq.fasta' or die "Cannot open seq.fasta: $!";

> @array = <FILE>;

> # Removing first line of fasta

> shift (@array);
> $array = join('',@array);
> open (FILE2, ">>seq2.fasta");
> print FILE2 "$array";

Note that you are writing just the sequence to your seq2.fasta file
here, so the new file isn't really a fasta file.

> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
> - alphabet => 'dna',);

Bio::PrimarySeq doesn't take a '-file' parameter. Also, note that the
filename is different than before "sekw2" vs. "seq2"!

Either you should use Bio::SeqIO with a '-file' parameter, or you can
use Bio::PrimarySeq with a '-seq' parameter.

> my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);

> my $monomer_ref = $seq_stats->count_monomers();

> foreach $base (sort keys %$monomer_ref) {
> print "Number of bases of type ", $base," = ", $monomer_ref{$base},"\n";

Here you wanted $monomer_ref->{$base}, as %monomer_ref isn't mentioned
anywhere else.

> }
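
To make the hash reference point concrete, here is a small sketch in plain Perl. The count_bases function is a made-up stand-in for what count_monomers() hands back (a reference to a hash of counts); it is not BioPerl code, and the sequence is just a short test string.

```perl
use strict;
use warnings;

# Hypothetical stand-in for count_monomers(): returns a reference to a
# hash mapping each (upper-cased) letter to its count.
sub count_bases {
    my ($seq) = @_;
    my %count;
    $count{ uc $_ }++ for grep { /[A-Za-z]/ } split //, $seq;
    return \%count;
}

my $monomer_ref = count_bases('aaaacccggt');

# $monomer_ref->{$base} follows the reference. $monomer_ref{$base}
# would instead look in a separate, undeclared hash %monomer_ref,
# which "use strict" rejects at compile time.
for my $base (sort keys %$monomer_ref) {
    print "Number of bases of type $base = $monomer_ref->{$base}\n";
}
```

Without "use strict", the wrong form compiles and quietly prints an empty value for every base, which matches the blank output in the original post.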

Here is a complete version of your script - I chose to use Bio::SeqIO -
that works:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use Bio::SeqIO;
  use Bio::Tools::SeqStats;

  my $io=Bio::SeqIO->new(-file=>'seq.fasta', -alphabet=>'dna');
  my $seqobj=$io->next_seq; # Get the first sequence from the file

  my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
  my $monomer_ref = $seq_stats->count_monomers();
  foreach my $base (sort keys %$monomer_ref) {
      print "Number of bases of type ", $base," = ", $monomer_ref->{$base},"\n";
  }

E.g.:

  $ cat seq.fasta
  >test
  aaaacccggt
  $ ./slym.pl 
  Number of bases of type A = 4
  Number of bases of type C = 3
  Number of bases of type G = 2
  Number of bases of type T = 1
  $ 


  Best regards,

    Adam

-- 
 "Grittings. Ma nam is Kahlfin."                              Adam Sjøgren
                                                         asjo at koldfront.dk


From whereverroadgoes at gmail.com  Mon Feb  4 13:02:29 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 10:02:29 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
Message-ID: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>

The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
error (saying that it wants input provided by Bio::PrimarySeq).
(btw in this line
 $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 
'dna',); 
there's a typo "sekw2" instead of "seq2" but this is correct in my original 
code).


From cjfields at illinois.edu  Mon Feb  4 13:54:39 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Mon, 4 Feb 2013 18:54:39 +0000
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE161ED@CHIMBX5.ad.uillinois.edu>

Please make sure to read both Roy's and Adam's responses all the way through; Bio::SeqIO is not a sequence object but the front end for format parsing (e.g. FASTA).  Bio::PrimarySeq does not have a '-file' parameter; Bio::SeqIO does.

If SeqStats truly doesn't work with Bio::Seq we can fix that, but according to Adam he has tested it using Bio::SeqIO and it seems to work.

chris

On Feb 4, 2013, at 12:02 PM, Slym <whereverroadgoes at gmail.com>
 wrote:

> The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
> error (saying that it wants input provided by Bio::PrimarySeq).
> (btw in this line
> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 
> 'dna',); 
> there's a typo "sekw2" instead of "seq2" but this is correct in my original 
> code).
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From asjo at koldfront.dk  Mon Feb  4 15:00:32 2013
From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Mon, 04 Feb 2013 21:00:32 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com> (Slym's
	message of "Mon, 4 Feb 2013 10:02:29 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
Message-ID: <87txpr26jj.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 10:02:29 -0800 (PST), Slym wrote:

> The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
> error (saying that it wants input provided by Bio::PrimarySeq).

That sounds like you forgot to call ->next_seq() on the Bio::SeqIO
object - to get a sequence object - please see the complete, working
example I sent earlier.


  Best regards,

    Adam

-- 
 "Denial springs eternal."                                    Adam Sjøgren
                                                         asjo at koldfront.dk


From scott at scottcain.net  Tue Feb  5 09:45:14 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 5 Feb 2013 09:45:14 -0500
Subject: [Bioperl-l] Have your say in the 2013 GMOD Community Survey!
Message-ID: <CA+JTaoy5NZubXo2jQ8oDN20BQ5BAHg3B9ZmYZRJ6f2Ryr+-awQ@mail.gmail.com>

Give us your thoughts on the GMOD project and win a personal DNA test
from 23andMe!

The GMOD project provides tools like GBrowse, Galaxy, MAKER, JBrowse,
Tripal, Apollo, Chado, and many more to a huge community of users and
developers around the world.

To make sure that GMOD is giving you the support you need, we want to
know how you use GMOD, which components you find valuable, your
opinion on support, training, and GMOD's strengths and weaknesses.
Your feedback is vital in helping GMOD to serve its user community
more effectively and to suggest future directions for the project.

Do the survey: http://gmod.org/survey.html

The survey should take between 10 and 15 minutes (including thinking
time), and participants can enter a draw to win "A Journey Through
Your DNA", the personal DNA test from 23andMe (the winner can pick a
$50 Amazon gift voucher if they prefer).

The survey will be open until March 1st. Results will be collated and
discussed at the April 2013 GMOD Meeting in Cambridge, UK, and posted
on the GMOD wiki at http://gmod.org.

Please spread the word to other friends and colleagues who use GMOD:
the more voices we hear, the better the picture we get of the needs of
our users, and the better we can help you!

Do the survey: http://gmod.org/survey.html

If you have any questions or problems with the survey, please email me
-- I will be happy to help out!

Thanks,
Scott


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

From tiago.hori at gmail.com  Tue Feb  5 10:21:55 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:21:55 -0800 (PST)
Subject: [Bioperl-l] Search I::O
Message-ID: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>

Hi All,

I am trying to find the best putative orthologs for 44K Atlantic Salmon 
sequences, and so I need to parse 44K BLAST reports to find the best human 
hit. I am trying to learn Bio::SearchIO, but when I try the first example in 
the HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast'
               -file => 'C001R047.txt');

while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }  
  }
}

I get this error: Odd number of elements in hash assignment at 
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

I am using BioPerl version 1.6.901. Is there a format problem with the 
blast reports?
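
A likely cause, offered as a reading of the code above rather than something confirmed in this message: the constructor call has no comma between 'blast' and -file. Perl then parses 'blast' -file as a subtraction, so the argument list ends up with three elements instead of four, which would explain "Odd number of elements in hash assignment" when the arguments are copied into a hash. The sketch below shows the effect without BioPerl; count_args is a hypothetical helper used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): reports how many
# arguments a call actually receives.
sub count_args { return scalar @_ }

my ($broken, $fixed);
{
    # Silence the "isn't numeric" warnings triggered by the subtraction.
    no warnings 'numeric';

    # As posted: no comma after 'blast', so Perl evaluates 'blast' - 'file'.
    $broken = count_args(-format => 'blast' -file => 'test.txt');
}

# With the comma restored, the list is two key/value pairs.
$fixed = count_args(-format => 'blast', -file => 'test.txt');

print "without comma: $broken arguments\n";
print "with comma:    $fixed arguments\n";
```

If that is indeed the problem, the call would read Bio::SearchIO->new(-format => 'blast', -file => 'C001R047.txt'), and the report format itself is probably not at fault.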

Any help would be greatly appreciated!

T.

From tiago.hori at gmail.com  Tue Feb  5 10:33:32 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:33:32 -0800 (PST)
Subject: [Bioperl-l] Search::IO example from HOWTO
Message-ID: <c87907a1-18da-49ed-ad70-55ca7bd27658@googlegroups.com>

Hi All,

I am trying to run the example from the SearchIO HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast'
               -file => 'test.txt');

while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }  
  }
}

And I get this error: Odd number of elements in hash assignment at 
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

Can anybody help?

Cheers,

T.


From carandraug+dev at gmail.com  Tue Feb  5 13:56:21 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Tue, 5 Feb 2013 18:56:21 +0000
Subject: [Bioperl-l] removing packages from bioperl-live
Message-ID: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>

Hi

some of the bioperl-live packages have already been split into
separate repositories. However, they were never actually removed from
bioperl-live. This creates 2 entry points for bug fixes and
implementations. After a chat on #bioperl, I was told to ask here.

Should these be removed? For example, there's bioperl-FeatureIO but
that code also exists in bioperl-live. Can I remove it from
bioperl-live?

Carnë


From cjfields at illinois.edu  Tue Feb  5 14:34:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 19:34:07 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from
 bioperl-live
In-Reply-To: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>

Probably should retitle this to ask the question directly (make sure the right radars are pinged).

My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).

chris

On Feb 5, 2013, at 12:56 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> some of the bioperl-live packages have already been split into
> separate repositories. However, they were never actually removed from
> bioperl-live. This creates 2 entry points for bug fixes and
> implementations. After a chat on #bioperl, I was told to ask here.
> 
> Should these be removed? For example, there's bioperl-FeatureIO but
> that code also exists in bioperl-live. Can I remove it from
> bioperl-live?
> 
> Carnë


From scott at scottcain.net  Tue Feb  5 14:36:10 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 5 Feb 2013 14:36:10 -0500
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID: <CA+JTaowxkgy+2ytqHG-MG6VrOdT7jGLQ9-_TJfVA3COsLgUZYw@mail.gmail.com>

I'm sure it will lead to lots of fun, but I suspect you are right and
it should be removed.  It's time you yank on that bandaid :-)

Scott


On Tue, Feb 5, 2013 at 2:34 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>
> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>
> chris
>
> On Feb 5, 2013, at 12:56 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> Hi
>>
>> some of the bioperl-live packages have already been split into
>> separate repositories. However, they were never actually removed from
>> bioperl-live. This creates 2 entry points for bug fixes and
>> implementations. After a chat on #bioperl, I was told to ask here.
>>
>> Should these be removed? For example, there's bioperl-FeatureIO but
>> that code also exists in bioperl-live. Can I remove it from
>> bioperl-live?
>>
>> Carnë


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From carandraug+dev at gmail.com  Tue Feb  5 15:06:23 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Tue, 5 Feb 2013 20:06:23 +0000
Subject: [Bioperl-l] dependencies on perl version
Message-ID: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>

Hi

how much perl backwards compatibility does bioperl need to keep?

If I have something I want to implement and use state (requires
5.010), is it acceptable? 5.010 is already a quite old perl version.
Of course, there are other less elegant ways to implement those
features. If I can't use modern perl stuff, what version number is the
limit?

Carnë


From carandraug+dev at gmail.com  Tue Feb  5 15:10:01 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Tue, 5 Feb 2013 20:10:01 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>

On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>
> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).

Mentioning Bio::FeatureIO was just an example. I meant to ask it as
more general. If the code is already in a separate repository, should
it be removed from bioperl-live?

Carnë


From cjfields at illinois.edu  Tue Feb  5 15:56:48 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:56:48 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>

Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.  

(for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
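
As a small illustration of what that looks like in practice (a sketch only; next_id is a hypothetical function, not BioPerl code): 'use 5.010' both refuses to compile on older perls and enables the 5.10 feature bundle, which includes 'state'.

```perl
use 5.010;      # dies at compile time on perl < 5.10; enables 'state' and 'say'
use strict;
use warnings;

# Hypothetical example: a counter that keeps its value between calls.
sub next_id {
    state $counter = 0;   # initialised only on the first call
    return ++$counter;
}

say next_id() for 1 .. 3;   # prints 1, 2, 3 on separate lines
```

On perl 5.8 the first line stops the script with a version error before anything else is parsed, which is friendlier than a syntax error further down.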

chris

On Feb 5, 2013, at 2:06 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> how much perl backwards compatibility does bioperl need to keep?
> 
> If I have something I want to implement and use state (requires
> 5.010), is it acceptable? 5.010 is already a quite old perl version.
> Of course, there are other less elegant ways to implement those
> features. If I can't use modern perl stuff, what version number is the
> limit?
> 
> Carnë


From cjfields at illinois.edu  Tue Feb  5 15:59:38 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:59:38 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 2:10 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>> 
>> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
> 
> Mentioning Bio::FeatureIO was just an example. I meant to ask it as
> more general. If the code is already in a separate repository, should
> it be removed from bioperl-live?
> 
> Carnë

Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better).  Once we get a new release out we should remove the rest.

chris

From cjfields at illinois.edu  Tue Feb  5 16:53:29 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 21:53:29 +0000
Subject: [Bioperl-l] Next BioPerl release
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>

All,

I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  

Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:

    https://github.com/bioperl/Bio-FeatureIO

Feedback, suggestions, etc are greatly appreciated.

chris

From miker at htblis.com  Tue Feb  5 19:54:17 2013
From: miker at htblis.com (Michael Rogoff)
Date: Tue, 5 Feb 2013 16:54:17 -0800
Subject: [Bioperl-l] Bio::Graphics error when rendering features with Split
	locations
Message-ID: <C71FF11A-F2E2-4204-9A10-50F5535A0C81@htblis.com>

When trying to render features from a genbank file that include a split location e.g.:

     promoter        join(1000..1080,1..5)
                     /label=PROM1

The following exception is raised:
Can't locate object method "has_tag" via package "Bio::Location::Simple" at lib/perl5/site_perl/5.10.1/Bio/Graphics/Glyph.pm line 704, <GEN0> line 36.

This can be reproduced with the code in the example "Rendering Features from a GenBank or EMBL File" from the Graphics HOW-TO:
http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File

Is there a way to change the script so that split locations would, at the very least, not cause a fatal error?  Is there a different glyph type that needs to be used?  Thanks in advance for any help.

I've attached a simple genbank input that will reproduce the error:

LOCUS       sample2     1080 bp DNA    circular
DEFINITION  Cloning vector sample2
ACCESSION   sample2
VERSION     sample2.1  GI:4352432
COMMENT     Component Fragments
FEATURES               Location/Qualifiers
     terminator      39..328
                     /label=TERM1
                     /note="terminator 1"
     misc_feature    393..488
                     /label=MF1
     CDS             complement(800..900)
                     /label=CDS1
                     /note="resistence gene"
     promoter        join(1000..1080,1..5)
                     /label=PROM1
ORIGIN
        1  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
       61  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      121  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      181  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      241  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      301  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      361  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      421  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      481  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      541  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      601  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      661  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      721  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      781  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      841  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      901  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      961  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
     1021  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
//


P.S.  I think I have traced the source of the problem to Glyph's _subfeat method, which in the case of a feature with split locations is returning location objects instead of feature objects.  Is this a bug?

sub _subfeat {
  my $class   = shift;
  my $feature = shift;

  return $feature->segments     if $feature->can('segments');

  my @split = eval { my $id   = $feature->location->seq_id;
                     my @subs = $feature->location->sub_Location;
                     grep {$id eq $_->seq_id} @subs;
                   };

  return @split if @split;

  # Either the APIs have changed, or I got confused at some point...
  return $feature->get_SeqFeatures         if $feature->can('get_SeqFeatures');
  return $feature->sub_SeqFeature          if $feature->can('sub_SeqFeature');
  return;
}


From l.m.timmermans at students.uu.nl  Tue Feb  5 21:40:27 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 03:40:27 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>

On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>
> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)

I *really* hate saying it, but I fear a lot of places are still stuck
on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
department still is and doesn't seem to be in a hurry to upgrade, and
I'm pretty sure it won't be the only one (though personally I use a
self-compiled 5.16).

Leon

From florent.angly at gmail.com  Tue Feb  5 21:51:27 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:51:27 +1000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
Message-ID: <5111C52F.50101@gmail.com>

On 06/02/13 06:59, Fields, Christopher J wrote:
> On Feb 5, 2013, at 2:10 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
>>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>>>
>>> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>> Mentioning Bio::FeatureIO was just an example. I meant to ask it as
>> more general. If the code is already in a separate repository, should
>> it be removed from bioperl-live?
>>
>> Carnë
> Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better).  Once we get a new release out we should remove the rest.
>
> chris

Sounds good to me (I've been burnt once by the fact that Bio::FeatureIO 
is in two places).
Florent


From florent.angly at gmail.com  Tue Feb  5 21:56:19 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:56:19 +1000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
Message-ID: <5111C653.2010703@gmail.com>

For what it's worth, the current stable version of Debian uses perl 
5.10.1 (http://packages.debian.org/stable/perl/perl).
Florent

On 06/02/13 12:40, Leon Timmermans wrote:
> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>
>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
> I *really* hate saying it, but I fear a lot of places are still stuck
> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
> department still is and doesn't seem to be in a hurry to upgrade, and
> I'm pretty sure it won't be the only one (though personally I use a
> self-compiled 5.16).
>
> Leon


From hlapp at drycafe.net  Tue Feb  5 22:27:35 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 5 Feb 2013 22:27:35 -0500
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <09524241-59F8-4BFF-8054-53CD0A649C11@drycafe.net>


On Feb 5, 2013, at 4:53 PM, Fields, Christopher J wrote:

> I am scheduling the next BioPerl CPAN release tentatively for March 1.

Yay!! Thanks for your leadership again, Chris, and for volunteering your time for the project. If nothing else, and I know this is no compensation really worth speaking of, we owe you beer, and I'll certainly pay my debt to you in Berlin if you come there.

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From hlapp at drycafe.net  Tue Feb  5 22:32:40 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 5 Feb 2013 22:32:40 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <5111C653.2010703@gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
Message-ID: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>

Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS.

8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.

	-hilmar

On Feb 5, 2013, at 9:56 PM, Florent Angly wrote:

> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl).
> Florent
> 
> On 06/02/13 12:40, Leon Timmermans wrote:
>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
>> <cjfields at illinois.edu> wrote:
>>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>> 
>>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
>> I *really* hate saying it, but I fear a lot of places are still stuck
>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
>> department still is and doesn't seem to be in a hurry to upgrade, and
>> I'm pretty sure it won't be the only one (though personally I use a
>> self-compiled 5.16).
>> 
>> Leon

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From cjfields at illinois.edu  Tue Feb  5 22:58:08 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 03:58:08 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18CBE@CHIMBX5.ad.uillinois.edu>

Re: being held back, I agree.  I don't necessarily want to intentionally break current modules by adding modern code unless it can be demonstrated to be a decent benefit performance-wise, but I don't want to impede new additions by requiring compat with perl 5.8 (hence my suggestion of a 'use 5.01x' pragma when appropriate).

Ubuntu 12.04 LTS is on perl 5.14.2: 

    http://askubuntu.com/questions/80672/what-perl-version-will-be-in-12-04-lts

BTW, I was wrong about perl 5.8 being 8 yrs old; it's almost 11 yrs old (perl 5.8.0 was released on 7/18/2002).  perl 5.8 reached end-of-life in 2008, fixes being only for security reasons.

So, I support dropping perl 5.8 support, but we should have a decent route of use for the folks stuck on old clusters.

chris

On Feb 5, 2013, at 9:32 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:

> Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS.
> 
> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.
> 
> 	-hilmar
> 
> On Feb 5, 2013, at 9:56 PM, Florent Angly wrote:
> 
>> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl).
>> Florent
>> 
>> On 06/02/13 12:40, Leon Timmermans wrote:
>>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
>>> <cjfields at illinois.edu> wrote:
>>>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>>> 
>>>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
>>> I *really* hate saying it, but I fear a lot of places are still stuck
>>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
>>> department still is and doesn't seem to be in a hurry to upgrade, and
>>> I'm pretty sure it won't be the only one (though personally I use a
>>> self-compiled 5.16).
>>> 
>>> Leon
> 
> -- 
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
> ===========================================================
> 


From l.m.timmermans at students.uu.nl  Tue Feb  5 23:11:52 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 05:11:52 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
Message-ID: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>

On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp <hlapp at drycafe.net> wrote:
> Does anyone know what Ubuntu uses?

5.14.2, distrowatch is your friend ;-)

> I've heard lots of other old version problems with CentOS.

I know people who still use CentOS 4 in production :-|

> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.

CentOS 5 is 6 years old (and will be supported another 4), but CentOS
6 is 'only' 19 months old. perl missing a release in the 5.8-5.10
timeframe, combined with an unfortunate alignment of its release
schedule with Red Hat's, doesn't do us any favors here.

Leon

From cjfields at illinois.edu  Tue Feb  5 23:14:24 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 04:14:24 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18E52@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 8:40 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>> 
>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
> 
> I *really* hate saying it, but I fear a lot of places are still stuck
> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
> department still is and doesn't seem to be in a hurry to upgrade, and
> I'm pretty sure it won't be the only one (though personally I use a
> self-compiled 5.16).
> 
> Leon

We had the same problem for a while, but our sysadmins were willing to set up perl 5.12 (at that time) loadable as a module (we can of course set up a local perl as well).  We're now using a sysadmin-installed perl 5.16 with our current cluster.

chris

From cjfields at illinois.edu  Tue Feb  5 23:24:31 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 04:24:31 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 10:11 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp <hlapp at drycafe.net> wrote:
>> Does anyone know what Ubuntu uses?
> 
> 5.14.2, distrowatch is your friend ;-)
> 
>> I've heard lots of other old version problems with CentOS.
> 
> I know people who still use CentOS 4 in production :-|
> 
>> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.
> 
> CentOS 5 is 6 years old (and will be supported another 4), but CentOS
> 6 is 'only' 19 months old. perl missing a release in the 5.8-5.10
> timeframe, combined with an unfortunate alignment of its release
> schedule with Red Hat's, doesn't do us any favors here.
> 
> Leon

Right, it took ~5.5 yrs to go from 5.8 to 5.10 (5.8.0 came out in July 2002, 5.10.0 in December 2007).  I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7).

We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases.

chris


From l.m.timmermans at students.uu.nl  Tue Feb  5 23:33:57 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 05:33:57 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXAjt8m9Go9YkGFOUkxw92FUoLFbs0Q_fys-f_gyAwX8yw@mail.gmail.com>

On Wed, Feb 6, 2013 at 5:24 AM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7).
>
> We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases.

Sounds reasonable. These things shouldn't come as a surprise.

I suspect that the thing that will save us is that most of these
people install it once and then never upgrade.

Leon

From hartzell at alerce.com  Wed Feb  6 12:58:07 2013
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 6 Feb 2013 09:58:07 -0800
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
Message-ID: <20754.39343.128576.743448@gargle.gargle.HOWL>

Fields, Christopher J writes:
 > [...]
 > Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
 > out that Python users are in the same boat: the Python version for
 > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
 > (and recommends python 2.7).   
 > 
 > We can always state that perl 5.8 is supported for the upcoming
 > Bioperl release, but we're dropping v5.8 support for any future
 > releases. 

Do more than drop support for 5.8.

The Perl community has put a transparent and predictable process in
place for releasing [generally] better versions of the language.  It
means that Perl has a chance of continuing to be relevant, attracting
new talent and actually *fixing* some of the s&%t that gives Perl a
bad rap.  It gives people something to plan around, no one should be
surprised that v 5.X.Y is coming out in mid 20ZZ.

BioPerl should do the same thing, declare a release policy that trails
along with the Perl release schedule.  Keep it simple and no one can
argue with it.  Support Perl releases as long as the releases
themselves are supported.
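
To make that policy concrete, here is a minimal sketch. It assumes the support window is the two most recent stable release series, as perlpolicy appears to describe it; the helper name and its arguments are hypothetical, not an existing BioPerl or core API.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the 5.$minor series (e.g. 14 for 5.14.x)
# lies inside the window of the $window most recent stable series, given
# the newest stable series 5.$latest. Stable series have even minor numbers.
sub is_supported_series {
    my ($minor, $latest, $window) = @_;
    $window = 2 unless defined $window;    # assumed default window size
    my $oldest = $latest - 2 * ($window - 1);
    return ($minor % 2 == 0 && $minor >= $oldest && $minor <= $latest) ? 1 : 0;
}

# With 5.16 as the newest stable series, check a few candidates.
for my $minor (8, 10, 14, 16) {
    printf "5.%d: %s\n", $minor,
        is_supported_series($minor, 16) ? 'supported' : 'not supported';
}
```

The point of keeping the rule this mechanical is that the minimum perl version follows from the Perl release calendar, so nobody has to argue each bump case by case.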

Rather than expending energy supporting out of date platforms, put the
energy into being modern (or Modern...), better distro building and
packaging, testing, documentation and releasing so that the process of
staying current is painless.

Look forward.  Keep it interesting and fun.

Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
make their living running sequencing gels in Plexiglas doohickeys on
their lab bench?

I'm not suggesting that the BioPerl community is free to make
arbitrary and capricious changes that make it difficult for *anyone*
to get anything done.  Churn is a waste of time.

But why should the all-volunteer BioPerl community be stuck supporting
code from 12 years ago because it's cost effective for someone else to
avoid spending *their* $/time/people to stay up to date?

Those sites that value stability/maturity/stagnation so highly have
already accepted the cost/difficulty of nailing one of their feet to
the floor as they try to run forward.  They recognize and depend on
the benefits of having that stable base but generally they've also
accepted the costs associated with their restrictive choices.  They
know how to pull in separate kernel/driver updates so that they can
actually run on nearly modern hardware.  They know, and live with, the
fact that they're not going to have access to the shiny new stuff.
And they know how to stay up to date, when they need to, with the
software that their users need to be competitive (e.g. BioConductor
and R).

As long as (if/when...) updating a BioPerl release is something that
can reliably happen with a few cpanm invocations then the sites that
otherwise favor punctuated equilibrium will learn to handle gradual
change.

Those folks that are "stuck" on older releases always have the option
of supporting professional Perl programmers to keep older releases
going, backport changes, etc....  They're already buying support for
their platforms (or freeloading and coping), let them put bread on the
table at one of the bioinformatics consultancies or labs if they have
something special they need.

Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
one is paying you to be backwards compatible with the previous
millennium.

g.

From amackey at virginia.edu  Wed Feb  6 13:47:46 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Wed, 6 Feb 2013 13:47:46 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
Message-ID: <CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>

Huzzah!

--
Aaron J. Mackey, PhD
Assistant Professor
Center for Public Health Genomics
University of Virginia
amackey at virginia.edu
http://www.cphg.virginia.edu/mackey


On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell <hartzell at alerce.com> wrote:

> Fields, Christopher J writes:
>  > [...]
>  > Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>  > out that Python users are in the same boat: the Python version for
>  > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>  > (and recommends python 2.7).
>  >
>  > We can always state that perl 5.8 is supported for the upcoming
>  > Bioperl release, but we're dropping v5.8 support for any future
>  > releases.
>
> Do more than drop support for 5.8.
>
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
>
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
>
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
>
> Look forward.  Keep it interesting and fun.
>
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
>
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that make it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
>
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date?
>
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
>
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
>
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
>
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
>
> g.

From tiago.hori at gmail.com  Wed Feb  6 08:25:41 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Wed, 6 Feb 2013 05:25:41 -0800 (PST)
Subject: [Bioperl-l] Problems installing Bio::Tools::Run:StandAloneBlastPlus
Message-ID: <9b488c6e-34b3-4269-a7ac-e2206720939a@googlegroups.com>

Hi Guys,

I am trying to install the module Bio::Tools::Run::StandAloneBlastPlus, but 
it has been hard so far.

I managed to install and compile samtools, after finding all the 
dependencies, but I am still missing something! I posted the complete 
report below!

Any help would be great!

Cheers,

T.

cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus
Reading '/home/tiagohori/.cpan/Metadata'
  Database was generated on Tue, 05 Feb 2013 18:41:03 GMT
Running install for module 'Bio::Tools::Run::StandAloneBlastPlus'
Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz
Checksum for /home/tiagohori/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz ok
Scanning cache /home/tiagohori/.cpan/build for sizes
..................................------------------------------------------DONE
DEL(1/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz 
DEL(2/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz.yml 
DEL(3/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO 
DEL(4/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO.yml 
DEL(5/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC 
DEL(6/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC.yml 
DEL(7/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt 
DEL(8/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt.yml 
DEL(9/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4 
DEL(10/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4.yml 
DEL(11/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5 
DEL(12/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5.yml 
DEL(13/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn 
DEL(14/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn.yml 
DEL(15/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o 
DEL(16/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o.yml 
DEL(17/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U 
DEL(18/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U.yml 
DEL(19/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v 
DEL(20/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v.yml 

  CPAN.pm: Building C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz

Install scripts? y/n [n ]
n 
Do you want to run tests that require connection to servers across the internet
(likely to cause some failures)? y/n [n ]
n 
  - will not run internet-requiring tests
Created MYMETA.yml and MYMETA.json
Creating new 'Build' script for 'BioPerl-Run' version '1.006900'
Building BioPerl-Run
  CJFIELDS/BioPerl-Run-1.006900.tar.gz
  ./Build -- OK
Running Build test
t/Amap.t ...................... 1/18 # Required executable for 
Bio::Tools::Run::Alignment::Amap is not present
t/Amap.t ...................... ok     
t/AnalysisFactory_soap.t ...... skipped: Network tests have not been 
requested
t/Analysis_soap.t ............. skipped: Network tests have not been 
requested
t/BEDTools.t .................. 3/423 # Required executable for 
Bio::Tools::Run::BEDTools is not present
t/BEDTools.t .................. ok       
t/BWA.t ....................... 1/36 # Required executable for 
Bio::Tools::Run::BWA is not present
t/BWA.t ....................... ok     
t/Blat.t ...................... 1/33 # Required executable for Bio::Tools::Run::Alignment::Blat is not present
# Looks like you planned 33 tests but ran 20.
t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 13/33 subtests 
(less 15 skipped subtests: 5 okay)
t/Bowtie.t .................... 1/73 # Required executable for 
Bio::Tools::Run::Bowtie is not present
t/Bowtie.t .................... ok     
t/Cap3.t ...................... 1/91 # Required executable for 
Bio::Tools::Run::Cap3 is not present
t/Cap3.t ...................... ok     
t/Clustalw.t .................. 1/45 # Required executable for 
Bio::Tools::Run::Alignment::Clustalw is not present
t/Clustalw.t .................. ok     
t/Coil.t ...................... 2/6 # Required executable for 
Bio::Tools::Run::Coil is not present
t/Coil.t ...................... ok   
t/Consense.t .................. 1/9 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::Consense is not present
t/Consense.t .................. ok   
t/DBA.t ....................... 1/18 # Required executable for 
Bio::Tools::Run::Alignment::DBA is not present
t/DBA.t ....................... ok     
t/DrawGram.t .................. 1/6 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::DrawGram is not present
t/DrawGram.t .................. ok   
t/DrawTree.t .................. 1/6 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::DrawTree is not present
t/DrawTree.t .................. ok   
t/EMBOSS.t .................... ok     
t/Ensembl.t ................... skipped: Network tests have not been 
requested
t/Eponine.t ................... 1/7 # Looks like you planned 7 tests but ran 2.
t/Eponine.t ................... Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 5/7 subtests 
t/Exonerate.t ................. 1/89 # Required executable for 
Bio::Tools::Run::Alignment::Exonerate is not present
t/Exonerate.t ................. ok     
t/FootPrinter.t ............... 1/24 # Required executable for 
Bio::Tools::Run::FootPrinter is not present
t/FootPrinter.t ............... ok     
t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable 
$GENEMARK_MODELS is not set
t/Genemark.hmm.prokaryotic.t .. ok     
t/Genewise.t .................. 1/20 # Required executable for 
Bio::Tools::Run::Genewise is not present
t/Genewise.t .................. ok     
t/Genscan.t ................... 1/6 # Required environment variable 
$GENSCANDIR is not set
t/Genscan.t ................... ok   
t/Gerp.t ...................... 1/33 # Required executable for 
Bio::Tools::Run::Phylo::Gerp is not present
t/Gerp.t ...................... ok     
t/Glimmer2.t .................. 1/217 # Required executable for 
Bio::Tools::Run::Glimmer is not present
t/Glimmer2.t .................. ok       
t/Glimmer3.t .................. 1/111 # Required executable for 
Bio::Tools::Run::Glimmer is not present
t/Glimmer3.t .................. ok       
t/Gumby.t ..................... 1/124 # Required executable for 
Bio::Tools::Run::Phylo::Gumby is not present
t/Gumby.t ..................... ok       
t/Hmmer.t ..................... 1/27 # Required executable for 
Bio::Tools::Run::Hmmer is not present
t/Hmmer.t ..................... ok     
t/Hyphy.t ..................... 2/15 # Required executable for 
Bio::Tools::Run::Phylo::Hyphy::SLAC is not present
t/Hyphy.t ..................... ok     
t/Infernal.t .................. 1/43 # Required executable for 
Bio::Tools::Run::Infernal is not present
t/Infernal.t .................. ok     
t/Kalign.t .................... 1/8 # Required executable for 
Bio::Tools::Run::Alignment::Kalign is not present
t/Kalign.t .................... ok   
t/LVB.t ....................... 1/19 # Required executable for 
Bio::Tools::Run::Phylo::LVB is not present
t/LVB.t ....................... ok     
t/Lagan.t ..................... 1/12 # Required executable for 
Bio::Tools::Run::Alignment::Lagan is not present
t/Lagan.t ..................... ok     
t/MAFFT.t ..................... 1/17 # Required executable for 
Bio::Tools::Run::Alignment::MAFFT is not present
t/MAFFT.t ..................... ok     
t/MCS.t ....................... 1/24 # Required executable for 
Bio::Tools::Run::MCS is not present
t/MCS.t ....................... ok     
t/Maq.t ....................... 1/51 # Required executable for 
Bio::Tools::Run::Maq is not present
t/Maq.t ....................... ok     
t/Match.t ..................... 1/7 # Required executable for 
Bio::Tools::Run::Match is not present
t/Match.t ..................... ok   
t/Mdust.t ..................... 1/5 # Required executable for 
Bio::Tools::Run::Mdust is not present
t/Mdust.t ..................... ok   
t/Meme.t ...................... 1/25 # Required executable for 
Bio::Tools::Run::Meme is not present
t/Meme.t ...................... ok     
t/Minimo.t .................... 1/72 # Required executable for 
Bio::Tools::Run::Minimo is not present
t/Minimo.t .................... ok     
t/Molphy.t .................... 1/10 # Required executable for 
Bio::Tools::Run::Phylo::Molphy::ProtML is not present
t/Molphy.t .................... ok     
t/Muscle.t .................... 1/16 # Required executable for 
Bio::Tools::Run::Alignment::Muscle is not present
t/Muscle.t .................... ok     
t/Neighbor.t .................. 1/17 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::Neighbor is not present
t/Neighbor.t .................. ok     
t/Newbler.t ................... 1/98 # Required executable for 
Bio::Tools::Run::Newbler is not present
t/Newbler.t ................... ok     
t/Njtree.t .................... 1/6 # Required executable for 
Bio::Tools::Run::Phylo::Njtree::Best is not present
t/Njtree.t .................... ok   
t/PAML.t ...................... 1/28 # Required executable for 
Bio::Tools::Run::Phylo::PAML::Codeml is not present
t/PAML.t ...................... ok     
t/Pal2Nal.t ................... 1/9 # Required executable for 
Bio::Tools::Run::Alignment::Pal2Nal is not present
t/Pal2Nal.t ................... ok   
t/PhastCons.t ................. 1/181 # Required executable for 
Bio::Tools::Run::Phylo::Phast::PhastCons is not present
t/PhastCons.t ................. ok       
t/Phrap.t ..................... 1/127 # Required executable for 
Bio::Tools::Run::Phrap is not present
t/Phrap.t ..................... ok       
t/Phyml.t ..................... 1/47 # Required executable for 
Bio::Tools::Run::Phylo::Phyml is not present
t/Phyml.t ..................... ok     
t/Primate.t ................... 1/8 # Required executable for 
Bio::Tools::Run::Primate is not present
t/Primate.t ................... ok   
t/Primer3.t ................... 1/9 # Required executable for 
Bio::Tools::Run::Primer3 is not present
t/Primer3.t ................... ok   
t/Prints.t .................... 1/7 # Required executable for 
Bio::Tools::Run::Prints is not present
t/Prints.t .................... ok   
t/Probalign.t ................. 1/13 # Required executable for 
Bio::Tools::Run::Alignment::Probalign is not present
t/Probalign.t ................. ok     
t/Probcons.t .................. 1/11 # Required executable for 
Bio::Tools::Run::Alignment::Probcons is not present
t/Probcons.t .................. ok     
t/Profile.t ................... 1/7 # Required executable for 
Bio::Tools::Run::Profile is not present
t/Profile.t ................... ok   
t/Promoterwise.t .............. 1/9 # Required executable for 
Bio::Tools::Run::Promoterwise is not present
t/Promoterwise.t .............. ok   
t/ProtDist.t .................. 1/14 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::ProtDist is not present
t/ProtDist.t .................. ok     
t/ProtPars.t .................. 1/11 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::ProtPars is not present
t/ProtPars.t .................. ok     
t/Pseudowise.t ................ 1/18 # Required executable for 
Bio::Tools::Run::Pseudowise is not present
t/Pseudowise.t ................ ok     
t/QuickTree.t ................. 1/13 # Required executable for 
Bio::Tools::Run::Phylo::QuickTree is not present
t/QuickTree.t ................. ok     
t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as  or 
not executable. 
# Required executable for Bio::Tools::Run::RepeatMasker is not present
t/RepeatMasker.t .............. ok     
t/SABlastPlus.t ............... 1/65 # Required executable for Bio::Tools::Run::BlastPlus is not present
# Looks like you planned 65 tests but ran 63.
t/SABlastPlus.t ............... Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 2/65 subtests 
(less 59 skipped subtests: 4 okay)
t/SLR.t ....................... 1/7 # Required executable for 
Bio::Tools::Run::Phylo::SLR is not present
t/SLR.t ....................... ok   
t/Samtools.t .................. ok     
t/Seg.t ....................... 1/8 # Required executable for 
Bio::Tools::Run::Seg is not present
t/Seg.t ....................... ok   
t/Semphy.t .................... 1/19 # Required executable for 
Bio::Tools::Run::Phylo::Semphy is not present
t/Semphy.t .................... ok     
t/SeqBoot.t ................... 1/9 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present
t/SeqBoot.t ................... ok   
t/Signalp.t ................... 1/7 # Required executable for 
Bio::Tools::Run::Signalp is not present
t/Signalp.t ................... ok   
t/Sim4.t ...................... 1/23 # Required executable for 
Bio::Tools::Run::Alignment::Sim4 is not present
t/Sim4.t ...................... ok     
t/Simprot.t ................... 1/6 # Required executable for 
Bio::Tools::Run::Simprot is not present
t/Simprot.t ................... ok   
t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap 
(or dependencies thereof) was not installed
t/SoapEU-unit.t ............... skipped: The optional module Bio::DB::ESoap 
(or dependencies thereof) was not installed
t/StandAloneFasta.t ........... 1/15 # Required executable for 
Bio::Tools::Run::Alignment::StandAloneFasta is not present
t/StandAloneFasta.t ........... ok     
t/TCoffee.t ................... 1/27 # Required executable for 
Bio::Tools::Run::Alignment::TCoffee is not present
t/TCoffee.t ................... ok     
t/TigrAssembler.t ............. 1/88 # Required executable for 
Bio::Tools::Run::TigrAssembler is not present
# Required executable for Bio::Tools::Run::TigrAssembler is not present
t/TigrAssembler.t ............. ok     
t/Tmhmm.t ..................... 1/9 # Required executable for 
Bio::Tools::Run::Tmhmm is not present
t/Tmhmm.t ..................... ok   
t/TribeMCL.t .................. ok     
t/Vista.t ..................... ok   
t/gmap-run.t .................. 1/8 # Required executable for 
Bio::Tools::Run::Alignment::Gmap is not present
t/gmap-run.t .................. ok   
t/tRNAscanSE.t ................ 1/12 # Required executable for 
Bio::Tools::Run::tRNAscanSE is not present
t/tRNAscanSE.t ................ ok     

Test Summary Report
-------------------
t/Blat.t                    (Wstat: 65280 Tests: 20 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 33 tests but ran 20.
t/Eponine.t                 (Wstat: 65280 Tests: 2 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 7 tests but ran 2.
t/SABlastPlus.t             (Wstat: 65280 Tests: 63 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 65 tests but ran 63.
Files=80, Tests=2876, 39 wallclock secs ( 0.54 usr  0.23 sys + 32.54 cusr 
 4.94 csys = 38.25 CPU)
Result: FAIL
Failed 3/80 test programs. 0/2876 subtests failed.
  CJFIELDS/BioPerl-Run-1.006900.tar.gz
  ./Build test -- NOT OK
//hint// to see the cpan-testers results for installing this module, try:
  reports CJFIELDS/BioPerl-Run-1.006900.tar.gz
Running Build install
  make test had returned bad status, won't install without force

From guy.leonard at gmail.com  Wed Feb  6 13:35:38 2013
From: guy.leonard at gmail.com (guy.leonard at gmail.com)
Date: Wed, 6 Feb 2013 10:35:38 -0800 (PST)
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>

Nice, super work. 

Will there be a rough list of feature changes/additions/deprecations, or
shall I consult the git logs?

On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote:
>
> All, 
>
> I am scheduling the next BioPerl CPAN release tentatively for March 1. 
>  Any help in triaging bug reports would be greatly appreciated!   
>
> Amongst all other changes, as mentioned in a separate thread we will 
> remove Bio::FeatureIO, now developed in a separate repository: 
>
>     https://github.com/bioperl/Bio-FeatureIO 
>
> Feedback, suggestions, etc are greatly appreciated. 
>
> chris 
> _______________________________________________ 
> Bioperl-l mailing list 
> Biop... at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l 
>

From sidd.basu at gmail.com  Wed Feb  6 14:36:17 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 6 Feb 2013 13:36:17 -0600
Subject: [Bioperl-l]  Re: Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com>

Hi, 

On Tue, 05 Feb 2013, Fields, Christopher J wrote:

> All,
> 
> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
> 
> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
> 
>     https://github.com/bioperl/Bio-FeatureIO
> 
> Feedback, suggestions, etc are greatly appreciated.

Here are the CI build reports for Perl 5.12, 5.14 and 5.16 using Travis.
https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true

I could not get 5.10 to work on Travis. Though I activated the (--network)
option, it still didn't run one of the tests that needs the network. Also, I was
initially confused by the fact that, although it has a dist.ini, the tests still
have to run through Build.PL. Running **dzil test** does not work.

Hope this helps.

thanks, 
-siddhartha

> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at illinois.edu  Wed Feb  6 14:46:49 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 19:46:49 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A109@CHIMBX5.ad.uillinois.edu>

We've been a little better at keeping track of significant changes this time 'round.  There aren't a lot of major updates, but it's important to make sure we get a release out to ensure everyone (not just those familiar with git) can access them.

chris

On Feb 6, 2013, at 12:35 PM, <guy.leonard at gmail.com>
 wrote:

> Nice, super work. 
> 
> Will there be a rough list of feature changes/addition/deprecation, or 
> shall I consult git logs?
> 
> On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote:
>> 
>> All, 
>> 
>> I am scheduling the next BioPerl CPAN release tentatively for March 1. 
>> Any help in triaging bug reports would be greatly appreciated!   
>> 
>> Amongst all other changes, as mentioned in a separate thread we will 
>> remove Bio::FeatureIO, now developed in a separate repository: 
>> 
>>    https://github.com/bioperl/Bio-FeatureIO 
>> 
>> Feedback, suggestions, etc are greatly appreciated. 
>> 
>> chris 
>> _______________________________________________ 
>> Bioperl-l mailing list 
>> Biop... at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l 
>> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Feb  6 14:54:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 19:54:58 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>

On Feb 6, 2013, at 1:36 PM, Siddhartha Basu <sidd.basu at gmail.com>
 wrote:

> Hi, 
> 
> On Tue, 05 Feb 2013, Fields, Christopher J wrote:
> 
>> All,
>> 
>> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
>> 
>> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
>> 
>>    https://github.com/bioperl/Bio-FeatureIO
>> 
>> Feedback, suggestions, etc are greatly appreciated.
> 
> Here are the CI build reports for Perl 5.12, 5.14 and 5.16 using Travis.
> https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
> https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
> https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true
> 
> I could not get 5.10 to work on Travis. Though I activated the (--network)
> option, it still didn't run one of the tests that needs the network. Also, I was
> initially confused by the fact that, although it has a dist.ini, the tests still
> have to run through Build.PL. Running **dzil test** does not work.
> 
> Hope this helps.
> 
> thanks, 
> -siddhartha

Just to point out, that was for Bio-FeatureIO.  Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release).  

Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken).  I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed.  Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation.

chris

From sidd.basu at gmail.com  Wed Feb  6 15:26:06 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 6 Feb 2013 14:26:06 -0600
Subject: [Bioperl-l]  Re: Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
Message-ID: <5112bc60.c69e320a.1e98.2028@mx.google.com>

On Wed, 06 Feb 2013, Fields, Christopher J wrote:

> On Feb 6, 2013, at 1:36 PM, Siddhartha Basu <sidd.basu at gmail.com>
>  wrote:
> 
> > Hi, 
> > 
> > On Tue, 05 Feb 2013, Fields, Christopher J wrote:
> > 
> >> All,
> >> 
> >> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
> >> 
> >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
> >> 
> >>    https://github.com/bioperl/Bio-FeatureIO
> >> 
> >> Feedback, suggestions, etc are greatly appreciated.
> > 
> > Here are the CI build reports for Perl 5.12, 5.14 and 5.16 using Travis.
> > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
> > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
> > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true
> > 
> > I could not get 5.10 to work on Travis. Though I activated the (--network)
> > option, it still didn't run one of the tests that needs the network. Also, I was
> > initially confused by the fact that, although it has a dist.ini, the tests still
> > have to run through Build.PL. Running **dzil test** does not work.
> > 
> > Hope this helps.
> > 
> > thanks, 
> > -siddhartha
> 
> Just to point out, that was for Bio-FeatureIO.  Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release).  
So, what steps are left for getting the release out to CPAN? For example,
are there a lot of feature branches still to be merged, or a lot of unit
tests still not passing? I'm just trying to figure out whether I could be
of any help in expediting the release process. However, if these are
already taken care of, please ignore this.

> 
> Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken).  I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed.  
As for the error I encountered, the presence of Build.PL was blocking the
dzil build/release process; by default, dzil expects to generate Build.PL
itself during its build/release process. However, I am not sure which
mode is the most suitable for BioPerl devs.
> Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation.

thanks, 
-siddhartha

> 
> chris

From hlapp at drycafe.net  Wed Feb  6 16:30:33 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 6 Feb 2013 16:30:33 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
Message-ID: <A78F0D43-8296-45CF-9409-320D1FE7CA2F@drycafe.net>

Great points, George, and you're making a very compelling argument. I'm in total agreement. It's almost becoming a reason to be embarrassed to still be programming in Perl these days, so one might as well have fun while it lasts.

	-hilmar

On Feb 6, 2013, at 12:58 PM, George Hartzell wrote:

> Fields, Christopher J writes:
>> [...]
>> Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>> out that Python users are in the same boat: the Python version for
>> CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>> (and recommends python 2.7).   
>> 
>> We can always state that perl 5.8 is supported for the upcoming
>> Bioperl release, but we're dropping v5.8 support for any future
>> releases. 
> 
> Do more than drop support for 5.8.
> 
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
> 
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
> 
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
> 
> Look forward.  Keep it interesting and fun.
> 
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
> 
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that make it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
> 
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date.
> 
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
> 
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
> 
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
> 
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
> 
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From cjfields at illinois.edu  Wed Feb  6 17:11:06 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:11:06 +0000
Subject: [Bioperl-l] BioPerl long-term, was Re:  dependencies on perl version
In-Reply-To: <CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>

George,

Should put your post on a pedestal :)

tl;dr version: I completely agree, but we need help in order to do this.

Long(-winded) version:

I agree completely, backwards compatibility is killing us.  But, we do need current and new people to get involved and help drive this forward.  We need people on all fronts, from coding and bug fixes to documentation and web site maintenance.  I've been driving this bus for a number of years now.  Not getting tired yet, but I am getting substantially busier with my current endeavors, so my time spent working on BioPerl has dwindled considerably.  Any additional support or sharing of responsibilities will help tremendously in keeping up momentum (if someone else wants to take the wheel for a bit, please let me know :).  

If we follow the perl release route, we should streamline the release process (think Dist::Zilla), end support of older versions of Perl, and work on a sustainable release schedule.  The fact that we have so many of us so-called 'old folks' speaking up in favor of this is a very good sign.  We do need a bit more than that; we need help.  BioPerl is a very large project.

There is a key point we need to address, one that is very important for the future of BioPerl.  I use Perl quite a bit in my current work (and dabble with Ruby and Python as well when I have to).  BioPerl?  A little, but not as much as I could.  

Shocked?  The three main reasons I don't use it 'in anger':  performance, performance, and performance.  It is very important that we make a concerted effort to address this at all levels.  It could be as simple as completely separating parsing from object creation (where the bulk of the performance problems seem to lie, but not all of them).  

A specific example: Heng Li once tested the performance of FASTQ parsing (perl, python, bioperl, biopython, his C code, etc). BioPerl's FASTQ couldn't even be measured; IIRC it went on for many hours until he killed it.  This was with the older version of the parser, but I'm willing to bet the newer one I wrote isn't any better.

This. needs. to. change.

I see no problem in stating any generic parsing and low-level interfaces are just as much a part of what BioPerl encompasses as the higher-level Bio::* classes themselves.  Steve and Jason were on to something with SearchIO; it's maybe not as performant as we would like, but it certainly is more flexible in terms of what can be done, b/c it separates out low-level parsing from object creation.  That's the general model we should look at.  There is a good reason Biopython is following this model with their SearchIO implementation (Peter C, are you reading this?)

We have a lot of very talented people involved with this project, both on the purely computational and purely biological end as well as the folks like me who straddle the two domains.  A lot of good code out there that can be used, wrapped, taken advantage of, including everything we currently have in BioPerl.  Let's come up with something that both works and works well, that people can use on a regular basis, even at a low level if they choose.  That alone would dissuade new users from writing up (yet another) custom FASTA/FASTQ/BLAST/GenBank/etc parser b/c the BioPerl one takes millennia to finish.  

A few examples on this front: Rob Buels created a generic parser for GFF3 (Bio::GFF3::LowLevel) with very few dependencies, which we wrap with the newer Bio::FeatureIO code.  Leon has Bio::SFF.  Lincoln of course wrote Bio::DB::Sam and Bio::DB::BigFile.  I have started a wrapper around Heng's FASTQ/FASTA parsing code (kseq); it seems to work quite well (~20M FASTQ in 30 sec last I recall?).  

So:

If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.

If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.

If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.

If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).

If it means this is to be BioPerl 2.0, then let's move in that direction, sooner rather than later.

But I can't do it alone.  We (not just me, but we) need to drive the direction we take.

First one who codes gets the gold ring.

chris

On Feb 6, 2013, at 12:47 PM, Aaron Mackey <amackey at virginia.edu>
 wrote:

> Huzzah!
> 
> --
> Aaron J. Mackey, PhD
> Assistant Professor
> Center for Public Health Genomics
> University of Virginia
> amackey at virginia.edu
> http://www.cphg.virginia.edu/mackey
> 
> 
> On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell <hartzell at alerce.com> wrote:
> Fields, Christopher J writes:
>  > [...]
>  > Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>  > out that Python users are in the same boat: the Python version for
>  > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>  > (and recommends python 2.7).
>  >
>  > We can always state that perl 5.8 is supported for the upcoming
>  > Bioperl release, but we're dropping v5.8 support for any future
>  > releases.
> 
> Do more than drop support for 5.8.
> 
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
> 
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
> 
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
> 
> Look forward.  Keep it interesting and fun.
> 
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
> 
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that make it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
> 
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date.
> 
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
> 
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
> 
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
> 
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
> 
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From cjfields at illinois.edu  Wed Feb  6 17:34:42 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:34:42 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re:  dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1AF0C@CHIMBX5.ad.uillinois.edu>

I want to clarify, parser optimization isn't the only point we need to focus on by any means (and may not be the main one).  There is a lot of room for improvement top to bottom, that was one specific example I have long held to be an issue.

-c

On Feb 6, 2013, at 4:11 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:

> Shocked?  The three main reasons I don't use it 'in anger':  performance, performance, and performance.  It is very important that we make a concerted effort to address this at all levels.  It could be as simple as completely separating parsing from object creation (where the bulk of the performance problems seem to lie, but not all of them).  
...

From p.j.a.cock at googlemail.com  Wed Feb  6 17:43:13 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 6 Feb 2013 22:43:13 +0000
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>

On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
>
> I see no problem in stating any generic parsing and low-level interfaces
> are just as much a part of what BioPerl encompasses as the higher-level
> Bio::* classes themselves.  Steve and Jason were on to something with
> SearchIO; it's maybe not as performant as we would like, but it certainly
> is more flexible in terms of what can be done, b/c it separates out
> low-level parsing from object creation.  That's the general model we
> should look at.  There is a good reason Biopython is following this
> model with their SearchIO implementation (Peter C, are you reading this?)

Actually I don't think we did end up with that kind of separation in the
Biopython SearchIO - which is not to say it isn't an excellent model
to follow. Rather the Biopython SearchIO (like the BioPerl one) had
as its first goal a consistent object model across assorted file
formats.

The idea of low-level, minimal-overhead parsers (which are very
format specific), on which a heavier but consistent object model
can be built, might be a good balance - the high-level API has the
convenience, but if you give that up you can have more speed.
That's what I recommend with FASTQ and Biopython, e.g.
http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
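
As a rough illustration of that two-layer split, here is a minimal sketch in
plain Python. It is not Biopython's or BioPerl's actual code: the names
fastq_tuples, fastq_reads and Read are hypothetical, and it assumes unwrapped
four-line FASTQ records with Phred+33 quality encoding. The low-level
generator yields plain strings; the object layer is built on top of it and is
only paid for when the caller wants it.

```python
import io
from dataclasses import dataclass
from typing import Iterator, List, TextIO, Tuple


def fastq_tuples(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Low-level layer: yield (title, sequence, quality) as plain strings.

    Assumption: simple four-line records, no line-wrapped sequences.
    """
    while True:
        title = handle.readline()
        if not title:
            return  # clean end of file
        seq = handle.readline().rstrip("\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\n")
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near: %r" % title)
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in: %r" % title)
        yield title[1:].rstrip("\n"), seq, qual


@dataclass
class Read:
    """Hypothetical high-level record, built only when the caller asks for it."""
    id: str
    description: str
    seq: str
    phred: List[int]


def fastq_reads(handle: TextIO, offset: int = 33) -> Iterator[Read]:
    """High-level layer: wrap each tuple in an object (convenient, slower)."""
    for title, seq, qual in fastq_tuples(handle):
        read_id, _, description = title.partition(" ")
        yield Read(read_id, description, seq, [ord(c) - offset for c in qual])


SAMPLE = "@r1 first read\nACGT\n+\nIIII\n@r2\nGG\n+\n!!\n"

# Fast path: count bases without building any objects.
total_bases = sum(len(seq) for _, seq, _ in fastq_tuples(io.StringIO(SAMPLE)))

# Convenient path: full objects with decoded quality scores.
reads = list(fastq_reads(io.StringIO(SAMPLE)))
```

Code that only needs counts or simple filtering can stay on the tuple layer,
while code that wants per-base scores uses the object layer; both share the
same format-specific parsing and the same error checks.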

>
> I have started a wrapper around Heng's FASTQ/FASTA parsing
> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> last I recall?).
>

I'd have to dig through my emails, but I think the BioRuby guys
looked at that too - as I recall while it was fast, the error handling
left something to be desired. Email me directly or on the BioRuby
list if you want to follow up on that.

Regards,

Peter

From cjfields at illinois.edu  Wed Feb  6 17:53:21 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:53:21 +0000
Subject: [Bioperl-l] FASTQ, was Re:  BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>

On Feb 6, 2013, at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> 
>> I see no problem in stating any generic parsing and low-level interfaces
>> are just as much a part of what BioPerl encompasses as the higher-level
>> Bio::* classes themselves.  Steve and Jason were on to something with
>> SearchIO; it's maybe not as performant as we would like, but it certainly
>> is more flexible in terms of what can be done, b/c it separates out
>> low-level parsing from object creation.  That's the general model we
>> should look at.  There is a good reason Biopython is following this
>> model with their SearchIO implementation (Peter C, are you reading this?)
> 
> Actually I don't think we did end up with that kind of separation in the
> Biopython SearchIO - which is not to say it isn't an excellent model
> to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> as the first goal a consistent object model across assorted file
> formats.
> 
> The idea of low-level, minimal-overhead parsers (which are very
> format specific), on which a heavier but consistent object model
> can be built, might be a good balance - the high-level API has the
> convenience, but if you give that up you can have more speed.
> That's what I recommend with FASTQ and Biopython, e.g.
> http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> 
>> 
>> I have started a wrapper around Heng's FASTQ/FASTA parsing
>> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
>> last I recall?).
>> 
> 
> I'd have to dig through my emails, but I think the BioRuby guys
> looked at that too - as I recall while it was fast, the error handling
> left something to be desired. Email me directly or on the BioRuby
> list if you want to follow up on that.
> 
> Regards,
> 
> Peter

I did a little work on this and it is worth following up on; I pulled the FASTQ test examples you created from the paper to test it out.  IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into.  Maybe worth moving to open-bio-l for broader discussion.

chris


From whereverroadgoes at gmail.com  Wed Feb  6 16:59:04 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Wed, 6 Feb 2013 13:59:04 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <87txpr26jj.fsf@topper.koldfront.dk>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
	<87txpr26jj.fsf@topper.koldfront.dk>
Message-ID: <411e920d-e614-417d-9198-78bef9adba16@googlegroups.com>

Everything's working now! Thank you very much, especially to you Adam!



From carandraug+dev at gmail.com  Wed Feb  6 20:38:20 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Thu, 7 Feb 2013 01:38:20 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_0esYVUe_0gZHdAtk4orJQMO82fLjnfNL3Nap=BqX7RWw@mail.gmail.com>

On 5 February 2013 20:56, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 5, 2013, at 2:06 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>> how much perl backwards compatibility does bioperl need to keep?
>
> Aim for 5.10.1, but be careful of smart-match.

Well, I solved my problem differently and ended up not needing any of
the new features. But next time I'll know. Thanks
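
For next time, a minimal sketch of what that advice might look like in
practice: declare 5.10.1 as the minimum perl and write membership tests
with grep rather than smart-match, whose semantics changed between
releases. The sub name "contains" and the format list are made up for
illustration.

```perl
use 5.010001;    # minimum perl is 5.10.1; this also enables say
use strict;
use warnings;

# Membership test written without smart-match ($wanted ~~ @list).
# Returns the number of matching elements (0 when absent).
sub contains {
    my ($wanted, @list) = @_;
    return scalar grep { $_ eq $wanted } @list;
}

my @formats = qw(fasta fastq genbank);    # illustrative data only
say contains('fastq', @formats) ? 'supported' : 'unsupported';
```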

Carnë


From pcantalupo at gmail.com  Wed Feb  6 23:04:08 2013
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Wed, 6 Feb 2013 23:04:08 -0500
Subject: [Bioperl-l] bug 3376 status needs updated
Message-ID: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>

Hi,

A few months ago, I fixed bug 3376 (
https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
updated to resolved or closed. Should I do this or is Chris the only one
who does that?

Thank you,

Paul

From cjfields at illinois.edu  Wed Feb  6 23:20:30 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 04:20:30 +0000
Subject: [Bioperl-l] bug 3376 status needs updated
In-Reply-To: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>
References: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B45C@CHIMBX5.ad.uillinois.edu>

No, go ahead and close it.  Let me know if you run into perm. problems with it.

chris

On Feb 6, 2013, at 10:04 PM, Paul Cantalupo <pcantalupo at gmail.com>
 wrote:

> Hi,
> 
> A few months ago, I fixed bug 3376 (
> https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
> The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
> updated to resolved or closed. Should I do this or is Chris the only one
> who does that?
> 
> Thank you,
> 
> Paul
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From l.m.timmermans at students.uu.nl  Thu Feb  7 04:07:57 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Thu, 7 Feb 2013 10:07:57 +0100
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <5112bc60.c69e320a.1e98.2028@mx.google.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
	<5112bc60.c69e320a.1e98.2028@mx.google.com>
Message-ID: <CAC1jpXDQG8NwaPKd8PEVqWs7NWHHAkrGaasCeJ+bKVy1z0he1Q@mail.gmail.com>

On Wed, Feb 6, 2013 at 9:26 PM, Siddhartha Basu <sidd.basu at gmail.com> wrote:
> As for the error I encountered, the presence of Build.PL was blocking the
> dzil build/release process. By default, dzil expects to generate
> Build.PL itself during its build/release process. However, I am not sure
> which mode is the most suitable for bioperl devs.

You can prune the Build.PL, and then let dzil add its own. We wouldn't
be the first to do that sort of thing.
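
A hedged sketch of what that could look like in dist.ini, assuming the
stock [GatherDir], [PruneFiles] and [ModuleBuild] plugins. This is only a
fragment for illustration, not BioPerl's actual configuration:

```ini
; dist.ini (fragment) - hypothetical, distribution metadata omitted
[GatherDir]

; drop the hand-written Build.PL gathered from the repository
[PruneFiles]
filename = Build.PL

; let dzil generate its own Build.PL
[ModuleBuild]
```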

Leon

From amackey at virginia.edu  Thu Feb  7 10:25:07 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 7 Feb 2013 10:25:07 -0500
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>

You might also want to consider a lazy/pull-based parser to defer
parsing/object-building for pieces of the object that don't get used.  This
also usually provides some error tolerance.

-Aaron

--
Aaron J. Mackey, PhD
Assistant Professor
Center for Public Health Genomics
University of Virginia
amackey at virginia.edu
http://www.cphg.virginia.edu/mackey


On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J <cjfields at illinois.edu> wrote:

> On Feb 6, 2013, at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> > <cjfields at illinois.edu> wrote:
> >>
> >> I see no problem in stating any generic parsing and low-level interfaces
> >> are just as much a part of what BioPerl encompasses as the higher-level
> >> Bio::* classes themselves.  Steve and Jason were on to something with
> >> SearchIO; it's maybe not as performant as we would like, but it certainly
> >> is more flexible in terms of what can be done, b/c it separates out
> >> low-level parsing from object creation.  That's the general model we
> >> should look at.  There is a good reason Biopython is following this
> >> model with their SearchIO implementation (Peter C, are you reading this?)
> >
> > Actually I don't think we did end up with that kind of separation in the
> > Biopython SearchIO - which is not to say it isn't an excellent model
> > to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> > as the first goal a consistent object model across assorted file
> > formats.
> >
> > The idea of low-level, minimal-overhead parsers (which are very
> > format specific), on which a heavier but consistent object model
> > can be built, might be a good balance - the high-level API has the
> > convenience, but if you give that up you can have more speed.
> > That's what I recommend with FASTQ and Biopython, e.g.
> > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> >
> >>
> >> I have started a wrapper around Heng's FASTQ/FASTA parsing
> >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> >> last I recall?).
> >>
> >
> > I'd have to dig through my emails, but I think the BioRuby guys
> > looked at that too - as I recall while it was fast, the error handling
> > left something to be desired. Email me directly or on the BioRuby
> > list if you want to follow up on that.
> >
> > Regards,
> >
> > Peter
>
> I did a little on this, worth following up on, but I pulled the FASTQ test
> examples you created from the paper to test it out.  IIRC it parsed where
> it needed to, but I'm not sure how it handled bad sequences, so yes, worth
> looking into.  Maybe worth moving to open-bio-l for broader discussion.
>
> chris
>
>
>

From tiago.hori at gmail.com  Thu Feb  7 09:58:37 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Thu, 7 Feb 2013 06:58:37 -0800 (PST)
Subject: [Bioperl-l] Search I::O
In-Reply-To: <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
References: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>
	<6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
Message-ID: <e5d61704-086a-4434-ae80-434252d1f55e@googlegroups.com>

Thanks, Jason! It is working now.

So here is what I am trying to accomplish. For a given BLASTX report, I
want to extract the best BLASTX hit that is human and whose description
does not contain "unnamed" or "PREDICTED". I got very close, but I still
can't get it to give me only the top BLAST hit; it gives me all BLAST hits
that meet my criteria. I tried using "last" to stop it from looping through
the hits once it found a human one, but it didn't work. Can someone help?
Here is my code so far (mostly stolen from the wiki).

use strict;
use warnings;
use Bio::SearchIO;

my $in = Bio::SearchIO->new(-format => 'blast',
                            -file   => 'testsalmon.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    if( $hit->description !~ /[Uu]nnamed|PREDICTED|hypothetical/ ) {
      if( $hit->description =~ /Homo sapiens/ ) {
        while( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
            if( $hsp->percent_identity >= 30 ) {
              if( $hsp->evalue <= 1e-05 ) {
                ## one tab-separated line per qualifying HSP
                print "Query=",        $result->query_name,    "\t",
                      " Description=", $hit->description,      "\t",
                      " Hit=",         $hit->name,             "\t",
                      " Length=",      $hsp->length('total'),  "\t",
                      " Percent_id=",  $hsp->percent_identity, "\n";
              }
            }
          }
        }
      }
    }
  }
}
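
One possible way to stop at the first qualifying hit, sketched under some
assumptions: a bare "last" only leaves the innermost loop (here the HSP
loop), so the hit loop keeps going. Wrapping the search in a sub and
returning from it (or using a labelled "last HIT") leaves both loops at
once. The sub name best_human_hit is made up, and it assumes the hits come
back in report order, best first. It only relies on the next_hit, next_hsp,
description, length, percent_identity and evalue methods already used above.

```perl
use strict;
use warnings;

# Return the first hit (and its first passing HSP) that is human, is not
# unnamed/PREDICTED/hypothetical, and meets the thresholds; returns an
# empty list when no hit qualifies. $result is any object with a next_hit
# method, e.g. a Bio::Search::Result::ResultI object.
sub best_human_hit {
    my ($result) = @_;
    HIT: while ( my $hit = $result->next_hit ) {
        my $desc = $hit->description;
        next HIT if $desc =~ /[Uu]nnamed|PREDICTED|hypothetical/;
        next HIT unless $desc =~ /Homo sapiens/;
        while ( my $hsp = $hit->next_hsp ) {
            next unless $hsp->length('total') > 50;
            next unless $hsp->percent_identity >= 30;
            next unless $hsp->evalue <= 1e-05;
            return ( $hit, $hsp );    # leaves both loops at once
        }
    }
    return;
}
```

Inside the next_result loop this could be called as
"my ($hit, $hsp) = best_human_hit($result);" followed by a single print
when $hit is defined.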


T.


On Wednesday, February 6, 2013 6:46:47 PM UTC-3:30, Jason Stajich wrote:
>
> you are missing a comma after the -format => 'blast' 
> should be 
> my $in = Bio::SearchIO->new(-format => 'blast',   
>   -file => 'XXX' ); 
>
>
> On Feb 5, 2013, at 7:21 AM, Tiago Hori <tiago... at gmail.com> wrote:
>
> > Hi All, 
> > 
> > I am trying to find the best putative orthologs for 44K Atlantic Salmon
> > sequences, and so I need to parse 44K BLAST reports to find the best
> > human hit. I am trying to learn SearchIO, but when I try the first
> > example on the HOWTO:
> >
> > use strict;
> > use Bio::SearchIO;
> > 
> > my $in = new Bio::SearchIO(-format => 'blast' 
> >               -file => 'C001R047.txt'); 
> > 
> > while( my $result = $in->next_result ) { 
> >  ## $result is a Bio::Search::Result::ResultI compliant object 
> >  while( my $hit = $result->next_hit ) { 
> >    ## $hit is a Bio::Search::Hit::HitI compliant object 
> >    while( my $hsp = $hit->next_hsp ) { 
> >      ## $hsp is a Bio::Search::HSP::HSPI compliant object 
> >      if( $hsp->length('total') > 50 ) { 
> >        if ( $hsp->percent_identity >= 75 ) { 
> >          print "Query=",   $result->query_name, 
> >            " Hit=",        $hit->name, 
> >            " Length=",     $hsp->length('total'), 
> >            " Percent_id=", $hsp->percent_identity, "\n"; 
> >        } 
> >      } 
> >    }   
> >  } 
> > } 
> > 
> > I get this error: Odd number of elements in hash assignment at 
> > /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. 
> > 
> > I am using BioPerl version 1.6.901. Is there a format problem with the 
> > blast reports? 
> > 
> > Any help would be greatly appreciated! 
> > 
> > T. 
>
> Jason Stajich 
> jason.... at gmail.com
> ja... at bioperl.org
>
>

From cjfields at illinois.edu  Thu Feb  7 10:56:04 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 15:56:04 +0000
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>

This will likely be the approach for a more NGS-friendly Bio::Seq class.  Calculation of the PHRED scores could also be deferred until needed.
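
As a rough sketch of deferring the PHRED calculation: the record keeps the
raw quality string and only decodes it the first time the scores are asked
for. LazyFastqRecord is a hypothetical class name (nothing like it exists
in BioPerl as far as this sketch is concerned), and Sanger-style encoding
(ASCII offset 33) is assumed.

```perl
use strict;
use warnings;

package LazyFastqRecord;    # hypothetical name, for illustration only

sub new {
    my ( $class, %args ) = @_;
    # Keep the raw quality string; nothing is decoded yet.
    return bless { seq => $args{seq}, raw_qual => $args{qual} }, $class;
}

sub seq { return $_[0]{seq} }

# Decode PHRED scores on first access and cache the array reference.
# Assumes ASCII offset 33 (Sanger / Illumina 1.8+).
sub qual {
    my ($self) = @_;
    $self->{qual} //= [ map { ord($_) - 33 } split //, $self->{raw_qual} ];
    return $self->{qual};
}

package main;

my $rec = LazyFastqRecord->new( seq => 'ACGT', qual => 'II5!' );
print join( ' ', @{ $rec->qual } ), "\n";    # prints "40 40 20 0"
```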

seqtk has some C-based methods that we could possibly take advantage of, but I will have to look into it.

chris

On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:

> You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used.  This also usually provides some error tolerance.
> 
> -Aaron
> 
> --
> Aaron J. Mackey, PhD
> Assistant Professor
> Center for Public Health Genomics
> University of Virginia
> amackey at virginia.edu
> http://www.cphg.virginia.edu/mackey
> 
> 
> On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 6, 2013, at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> 
> > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> > <cjfields at illinois.edu> wrote:
> >>
> >> I see no problem in stating any generic parsing and low-level interfaces
> >> are just as much a part of what BioPerl encompasses as the higher-level
> >> Bio::* classes themselves.  Steve and Jason were on to something with
> >> SearchIO; it's maybe not as performant as we would like, but it certainly
> >> is more flexible in terms of what can be done, b/c it separates out
> >> low-level parsing from object creation.  That's the general model we
> >> should look at.  There is a good reason Biopython is following this
> >> model with their SearchIO implementation (Peter C, are you reading this?)
> >
> > Actually I don't think we did end up with that kind of separation in the
> > Biopython SearchIO - which is not to say it isn't an excellent model
> > to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> > as the first goal a consistent object model across assorted file
> > formats.
> >
> > The idea of low-level, minimal-overhead parsers (which are very
> > format specific), on which a heavier but consistent object model
> > can be built, might be a good balance - the high-level API has the
> > convenience, but if you give that up you can have more speed.
> > That's what I recommend with FASTQ and Biopython, e.g.
> > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> >
> >>
> >> I have started a wrapper around Heng's FASTQ/FASTA parsing
> >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> >> last I recall?).
> >>
> >
> > I'd have to dig through my emails, but I think the BioRuby guys
> > looked at that too - as I recall while it was fast, the error handling
> > left something to be desired. Email me directly or on the BioRuby
> > list if you want to follow up on that.
> >
> > Regards,
> >
> > Peter
> 
> I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out.  IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into.  Maybe worth moving to open-bio-l for broader discussion.
> 
> chris
> 
> 
> 


From amackey at virginia.edu  Thu Feb  7 11:09:14 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 7 Feb 2013 11:09:14 -0500
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>

e.g., a pull-based FASTQ parser that did nothing else at the top level but
"chunk" the file into as-yet-unparsed four-line blobs could appear to work
very fast, if the user code did nothing but count the number of entries:

  while (my $seq = $seqio->next_seq) { $ct++ }

in other words, you defer *everything* except the minimal amount of
parsing/logic required to detect object boundaries.
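
A minimal sketch of that idea, assuming strict four-line FASTQ (no wrapped
sequence or quality lines). FastqChunker, next_chunk and parse_chunk are
made-up names for illustration, not BioPerl API:

```perl
use strict;
use warnings;

package FastqChunker;    # hypothetical name; a sketch, not a BioPerl class

sub new {
    my ( $class, $fh ) = @_;
    return bless { fh => $fh }, $class;
}

# Return the next record as an unparsed four-line blob, or nothing at EOF.
# A truncated trailing record is silently dropped in this sketch.
sub next_chunk {
    my ($self) = @_;
    my $fh   = $self->{fh};
    my $blob = '';
    for ( 1 .. 4 ) {
        my $line = <$fh>;
        return unless defined $line;
        $blob .= $line;
    }
    return $blob;
}

# Parsing is a separate step that the caller pays for only when needed.
sub parse_chunk {
    my ($blob) = @_;
    my ( $head, $seq, undef, $qual ) = split /\n/, $blob;
    $head =~ s/^@//;
    return { id => $head, seq => $seq, qual => $qual };
}

package main;

my $data = "\@r1\nACGT\n+\nIIII\n\@r2\nGGCC\n+\n!!!!\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my $chunker = FastqChunker->new($fh);
my $ct      = 0;
$ct++ while defined $chunker->next_chunk;    # counting never parses a record
print "$ct records\n";                       # prints "2 records"
```

Counting entries touches only line boundaries; a caller that needs the
sequence or qualities would pass a blob to parse_chunk.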

This is, in fact, the exact opposite of the event-based SearchIO "push"
parsers, which always perform the most parsing possible, despite the user
never accessing most of the material.

Lastly, with respect to performance, if the parsing/object building
operation is not simply IO bound, then parallel parser/object-building CPU
threads could be considered, which could then dynamically adapt to
pre-parse attributes (e.g. quality scores) that the calling code was
actually using.  What's the state of thread-safe Perl these days?

-Aaron


On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <cjfields at illinois.edu> wrote:

> This will likely be the approach for more NGS-friendly Bio::Seq class.
>  Calculation of the PHRED scores could also be deferred until needed.
>
> seqtk has some C-based methods that we could possibly take advantage of,
> but will have to look into it.
>
> chris
>
> On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:
>
> > You might also want to consider a lazy/pull-based parser to defer
> > parsing/object-building for pieces of the object that don't get used.  This
> > also usually provides some error tolerance.
> >
> > -Aaron
>

From sidd.basu at gmail.com  Thu Feb  7 11:38:47 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Thu, 7 Feb 2013 10:38:47 -0600
Subject: [Bioperl-l]  Re: FASTQ, was Re:BioPerl long-term,
	was Re:	dependencies on perl version
In-Reply-To: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
References: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
Message-ID: <5113d899.ea64320a.489a.262d@mx.google.com>

Another approach might be to use map-reduce (Hadoop), if possible. I have
seen one implementation for Biopython's GFF3 parser.
http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/

-siddhartha


On Thu, 07 Feb 2013, Aaron Mackey wrote:

> e.g., a pull-based FASTQ parser that did nothing else at the top level but
> "chunk" the file into as-yet-unparsed four-line blobs could appear to work
> very fast, if the user code did nothing but count the number of entries:
> 
>   while (my $seq = $seqio->nextseq) { $ct++ };
> 
> in other words, you defer *everything* except the minimal amount of
> parsing/logic required to detect object boundaries.
> 
> This is, in fact, the exact opposite of the event-based SearchIO "push"
> parsers, which always perform the most parsing possible, despite the user
> never accessing most of the material.
> 
> Lastly, with respect to performance, if the parsing/object building
> operation is not simply IO bound, then parallel parser/object-building CPU
> threads could be considered, which could then dynamically adapt to
> pre-parse attributes (e.g. quality scores) that the calling code was
> actually using.  What's the state of thread-safe Perl these days?
> 
> -Aaron
> 
> 
> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <cjfields at illinois.edu> wrote:
> 
> > This will likely be the approach for more NGS-friendly Bio::Seq class.
> >  Calculation of the PHRED scores could also be deferred until needed.
> >
> > seqtk has some C-based methods that we could possibly take advantage of,
> > but will have to look into it.
> >
> > chris
> >
> > On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:
> >
> > > You might also want to consider a lazy/pull-based parser to defer
> > > parsing/object-building for pieces of the object that don't get used.  This
> > > also usually provides some error tolerance.
> > >
> > > -Aaron
> >

From cjfields at illinois.edu  Thu Feb  7 11:55:53 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 16:55:53 +0000
Subject: [Bioperl-l] FASTQ, was Re:BioPerl long-term,
	was Re:	dependencies on perl version
In-Reply-To: <5113d899.ea64320a.489a.262d@mx.google.com>
References: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
	<5113d899.ea64320a.489a.262d@mx.google.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7B8@CHIMBX5.ad.uillinois.edu>

I think we will want to allow for a multitude of implementations.  SeqIO already allows for that to a degree, but multiple backend implementations (say, different ways of parsing/processing FASTQ and others) aren't supported yet.

chris

On Feb 7, 2013, at 10:38 AM, Siddhartha Basu <sidd.basu at gmail.com> wrote:

> Another approach might be use map-reduce(Hadoop) if possible. I have
> seen one implementation in biopython's GFF3 parser.
> http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/
> 
> -siddhartha
> 
> 
> On Thu, 07 Feb 2013, Aaron Mackey wrote:
> 
>> e.g., a pull-based FASTQ parser that did nothing else at the top level but
>> "chunk" the file into as-yet-unparsed four-line blobs could appear to work
>> very fast, if the user code did nothing but count the number of entries:
>> 
>>  while (my $seq = $seqio->nextseq) { $ct++ };
>> 
>> in other words, you defer *everything* except the minimal amount of
>> parsing/logic required to detect object boundaries.
>> 
>> This is, in fact, the exact opposite of the event-based SearchIO "push"
>> parsers, which always perform the most parsing possible, despite the user
>> never accessing most of the material.
>> 
>> Lastly, with respect to performance, if the parsing/object building
>> operation is not simply IO bound, then parallel parser/object-building CPU
>> threads could be considered, which could then dynamically adapt to
>> pre-parse attributes (e.g. quality scores) that the calling code was
>> actually using.  What's the state of thread-safe Perl these days?
>> 
>> -Aaron
>> 
>> 
>> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> 
>>> This will likely be the approach for more NGS-friendly Bio::Seq class.
>>> Calculation of the PHRED scores could also be deferred until needed.
>>> 
>>> seqtk has some C-based methods that we could possibly take advantage of,
>>> but will have to look into it.
>>> 
>>> chris
>>> 
>>> On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:
>>> 
>>>> You might also want to consider a lazy/pull-based parser to defer
>>>> parsing/object-building for pieces of the object that don't get used.  This
>>>> also usually provides some error tolerance.
>>>> 
>>>> -Aaron
>>> 


From cjfields at illinois.edu  Thu Feb  7 12:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 17:01:07 +0000
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7EF@CHIMBX5.ad.uillinois.edu>

re: thread-safe perl, so-so at best from what I understand.

chris

On Feb 7, 2013, at 10:09 AM, Aaron Mackey <amackey at virginia.edu> wrote:

> e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries:
> 
>   while (my $seq = $seqio->nextseq) { $ct++ };
> 
> in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries.
> 
> This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material.
> 
> Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using.  What's the state of thread-safe Perl these days?
> 
> -Aaron
> 
> 
> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <cjfields at illinois.edu> wrote:
> This will likely be the approach for more NGS-friendly Bio::Seq class.  Calculation of the PHRED scores could also be deferred until needed.
> 
> seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it.
> 
> chris
> 
> On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:
> 
> > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used.  This also usually provides some error tolerance.
> >
> > -Aaron


From hartzell at alerce.com  Thu Feb  7 16:36:24 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 13:36:24 -0800
Subject: [Bioperl-l]  BioPerl long-term,
	was Re:  dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <20756.7768.125680.662488@gargle.gargle.HOWL>

Fields, Christopher J writes:
 > George,
 > 
 > Should put your post on a pedestal :)
 > 
 > tl;dr version: I completely agree, but we need help in order to do this.
 > [...]

And therein lies the [a] problem.  Don't look at me....

I'm not coding on bioinformatics problems these days (though I'm
available...) so _maybe_ I shouldn't have gotten up on the soapbox.

But I'm so sick of getting into arguments (or walking away from
them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
you can't write good code in Perl, look - Ruby has GEMS!, etc...

Perl of the olden days was an easy language in which to write really
shitty code.  Even the Perl of the BioPerl heyday wasn't really much
help; roll your own OO, roll your own distro-building, mountains of
monkey-work to provide consistent POD, versioning, etc...

But that's not the Perl that I use.  I have Moose and Moo.  TAP and
the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

It isn't any harder to write good code, for measures that I care
about, using Perl than it is *any* of the other similar languages.

And it's just as easy, and happens just as frequently, for people to
write shitty (undocumented, untested, poorly managed, poorly packaged,
...) stuff in the other languages.

GET OFF MY LAWN, KID! (Yeah, I know...)

But BioPerl *is* dying.  You might be standing on the shoulders of
giants when you use it to solve a problem, but you *definitely* have
those same giants (and their extended families) on your shoulders
every time I see you try to move the project forward.  All of that
history has become the tail that's wagging the dog.

If all y'all are going to keep the thing alive, moving forward and
contributing to new great works then make Apple your hero.  Deprecate
the stuff that's holding you back, give folks a path forward and move
on.

Have fun.  Use sharp tools.  Do cool science.  Build cool things.
Advance your careers (forgot that one last time).  Be reasonable and
professional.

Supporting last year's projects is someone else's business
opportunity.

g.

ps.  Are all y'all following this thread?

     http://news.ycombinator.com/item?id=5123022

Maybe someone should search down for this bit: "Where to start? Any
list of this [sic] projects?" and insert a plug for the various
open-bio projects.  (But "someone" doesn't work here, he said...).

From cjfields at illinois.edu  Thu Feb  7 18:12:19 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 23:12:19 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re:  dependencies on perl version
In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1D071@CHIMBX5.ad.uillinois.edu>

On Feb 7, 2013, at 3:36 PM, George Hartzell <hartzell at alerce.com> wrote:

> Fields, Christopher J writes:
>> George,
>> 
>> Should put your post on a pedestal :)
>> 
>> tl;dr version: I completely agree, but we need help in order to do this.
>> [...]
> 
> And therein lies the [a] problem.  Don't look at me....
> 
> I'm not coding on bioinformatics problems these days (though I'm
> available...) so _maybe_ I shouldn't have gotten up on the soapbox.
> 
> But I'm so sick of getting into arguments (or walking away from
> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
> you can't write good code in Perl, look - Ruby has GEMS!, etc...

Right, but that's a perception not just in the Bio* world.  It's larger and more pervasive than that.  

> Perl of the olden days was an easy language in which to write really
> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
> help; roll your own OO, roll your own distro-building, mountains of
> monkey-work to provide consistent POD, versioning, etc...
> 
> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

Yes, and that is the direction we need to go in.

> It isn't any harder to write good code, for measures that I care
> about, using Perl than it is *any* of the other similar languages.
> 
> And it's just as easy, and happens just as frequently, for people to
> write shitty (undocumented, untested, poorly managed, poorly packaged,
> ...) stuff in the other languages.

Oh, I know.  I'm working on some very nice looking but terribly implemented Python code now.

> GET OFF MY LAWN, KID! (Yeah, I know...)
> 
> But BioPerl *is* dying.  You might be standing on the shoulders of
> giants when you use it to solve a problem, but you *definitely* have
> those same giants (and their extended families) on your shoulders
> every time I see you try to move the project forward.  All of that
> history has become the tail that's wagging the dog.

Yep.

> If all y'all are going to keep the thing alive, moving forward and
> contributing to new great works then make Apple your hero.  Deprecate
> the stuff that's holding you back, give folks a path forward and move
> on.

That's fine.

> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
> Advance your careers (forgot that one last time).  Be reasonable and
> professional.
> 
> Supporting last year's projects is someone else's business
> opportunity.
> 
> g.

Right, but this isn't just my show.  I can't do this alone; it's simply too much code and I don't have even 1/4 the time I used to have.

> ps.  Are all y'all following this thread?
> 
>     http://news.ycombinator.com/item?id=5123022
> 
> Maybe someone should search down for this bit: "Where to start? Any
> list of this [sic] projects?" and insert a plug for the various
> open-bio projects.  (But "someone" doesn't work here, he said...).

Read the original guy's post.  He's completely delusional (okay, maybe not *completely*, but he comes across as quite bitter and unrealistic).  

Frankly I don't feel so bad if he wants to leave.  He doesn't like messy things.  Biology is messy; if one doesn't understand that, then computational biology is not for them.

chris


From carandraug+dev at gmail.com  Thu Feb  7 23:12:22 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Fri, 8 Feb 2013 04:12:22 +0000
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
Message-ID: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>

On 6 February 2013 22:11, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
> [...]
> So:
>
> If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.
>
> If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.
>
> If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.
>
> If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).
>
> If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later.
>
> But I can't do it alone.  We (not just me, but we) need to drive the direction we take.
>
> First one who codes gets the gold ring.

Hi

I know I'm not much involved with bioperl development but here's my
suggestion as maintainer of another quite modular free software
project. I swear I'm not promoting it. Skip to the last paragraph for
the very short version.

Octave Forge is now a collection of packages for GNU Octave, each
released independently whenever its maintainer sees fit. But it wasn't
like that before. For a long time, everything was released at the same
time; there were no independent packages. Then it was decided to split
it into sections: main, extra and nonfree (free software dependent on
non-free libraries, now purged), and inside those, it was split into
packages, each with its own maintainer. But some packages were (and
are) more active than the others. Some packages even came from single
contributions and we never heard from the authors again. And so, with
time, cruft settled in.

We didn't want to remove the code, but no one was interested, or
comfortable enough in the field, to fix it either. Packages that had a
much more active development were being dragged down by code that no
one was maintaining. So we broke with that and each package is now
released independently. We have packages that haven't been released in
3 years, yes, but that just shows which packages no one cares about.
Those have been marked as unmaintained and anyone can come around and
make a release if they care about it.

As the maintainer of the project, I do *not* make the releases of the
packages. The package maintainers prepare everything and upload
it; I only run a handful of tests (takes me 10min), upload it to our
server, and make the official announcement. I am also the maintainer
of one of the packages, and have often made releases of unmaintained
packages because I needed them. That's to show, if they are important
enough for someone, they will get a release somehow. If they are not
important, why would we waste our time on them anyway? We now have around 5
package releases per month, many of them being minor releases with a
handful of bug fixes. Preparing a release of a small package is much
easier and much less trouble than preparing a giant release
encompassing all of them at the same time.

Short version:
I'd recommend splitting the project into much smaller ones. Some of the
small ones will wither and die but those are the less important ones,
and that will allow the others, the ones that people care about, freedom to
grow faster. Bioperl would still be just one project that
incorporates a hundred or so smaller modules. Let those who care
the most about a specific module take care of it and make the
releases. Releasing a module becomes much simpler, which means more
releases, more activity, and the smaller code base for each module
also makes it less intimidating for new contributors.

Carnë


From hartzell at alerce.com  Fri Feb  8 01:17:17 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 22:17:17 -0800
Subject: [Bioperl-l] injecting a bit of levity....
Message-ID: <20756.39021.553502.116384@gargle.gargle.HOWL>


Perl's not dead.  It's FAMOUS!

  http://imgs.xkcd.com/comics/perl_problems.png

g.

From carandraug+dev at gmail.com  Fri Feb  8 01:57:30 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Fri, 8 Feb 2013 06:57:30 +0000
Subject: [Bioperl-l] getting a Bio::Search::HSP::HSPI from Bio::SimpleAlign
 (to find differences between sequences)
Message-ID: <CAPOrs_084-eh9kq=uWk19jvLagKKGr2qOs3HpGLpBt7YOLaO4A@mail.gmail.com>

Hi

I already have a Bio::SimpleAlign object (got it after using TCoffee
through bioperl-run module) and I'm trying to get a
Bio::Search::HSP::HSPI object from a pair of the aligned sequences.
How can I do this? I want to use the seq_inds method to compare the
sequences.

Here's my actual problem just in case I should be trying to fix it
some other way. I have a bunch of sequences from protein isoforms.
They have small differences between them, point-mutations, small
insertions or deletions, nothing too big. I want to make a table of
the mutations that each of them has against the consensus sequence. I
already made the alignment and have the consensus from
"$align->consensus_string". Now, I want to get something like:

isoform1: Ala67Gly, His90_Met91insGln
isoform2: ....

The seq_inds method from the Bio::Search::HSP::HSPI class seems to do
the part of finding the differences, but how can I get one? I can't
find it in the documentation.

Any tips, and even showing a different approach to my problem, are
most appreciated. Thanks,

Carnë
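
One possible different approach, sketched here without any BioPerl calls:
since the alignment and the consensus are already available as gapped strings
of equal length, the differences can be listed by walking the columns.
Everything in the sketch is an assumption made for illustration: the sub name
list_differences, '-' as the gap character, one-letter residue codes in the
output, and positions counted along the ungapped consensus. The example
strings are made up.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: compare one gapped sequence with a gapped reference
# of the same length and return a list of difference descriptions.
sub list_differences {
    my ($ref, $seq) = @_;
    die "sequences must have equal length\n"
        unless length($ref) == length($seq);

    my @diffs;
    my $pos = 0;    # position along the ungapped reference
    for my $i (0 .. length($ref) - 1) {
        my $r = substr($ref, $i, 1);
        my $s = substr($seq, $i, 1);
        $pos++ if $r ne '-';
        next if $r eq $s;    # identical column (including gap vs gap)

        if ($r eq '-') {
            push @diffs, "${pos}ins$s";     # residue inserted after $pos
        }
        elsif ($s eq '-') {
            push @diffs, "$r${pos}del";     # reference residue missing
        }
        else {
            push @diffs, "$r$pos$s";        # substitution
        }
    }
    return @diffs;
}

# Made-up gapped strings standing in for the consensus and two isoforms.
my $consensus = 'MKA-LH';
my %isoforms  = (
    isoform1 => 'MKGQLH',
    isoform2 => 'M-A-LH',
);

for my $name (sort keys %isoforms) {
    printf "%s: %s\n", $name,
        join(', ', list_differences($consensus, $isoforms{$name}));
}
```

Runs of inserted or deleted residues are reported one residue at a time here;
merging them into ranges and switching to three-letter codes would be further
steps on top of this.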


From l.m.timmermans at students.uu.nl  Fri Feb  8 06:18:58 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Fri, 8 Feb 2013 12:18:58 +0100
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
Message-ID: <CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>

On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell <hartzell at alerce.com> wrote:
> But I'm so sick of getting into arguments (or walking away from
> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
> you can't write good code in Perl, look - Ruby has GEMS!, etc...
>
> Perl of the olden days was an easy language in which to write really
> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
> help; roll your own OO, roll your own distro-building, mountains of
> monkey-work to provide consistent POD, versioning, etc...
>
> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

I share that experience.

> But BioPerl *is* dying.  You might be standing on the shoulders of
> giants when you use it to solve a problem, but you *definitely* have
> those same giants (and their extended families) on your shoulders
> every time I see you try to move the project forward.  All of that
> history has become the tail that's wagging the dog.

I share your sentiment. Most of BioPerl is architected so badly I
can't stomach it most days, and I've worked on hairy codebases
including perl itself. There's just too much sick and wrong. It's like
hundreds of dot-com-era cgi scripts.

The problem (which is common in scientific computing) is that once
code works it's effectively abandoned. BioPerl is essentially a
gathering of more than a thousand such modules.

> If all y'all are going to keep the thing alive, moving forward and
> contributing to new great works then make Apple your hero.  Deprecate
> the stuff that's holding you back, give folks a path forward and move
> on.

That would be lovely, but who is going to do that? We're suffering
from the tragedy of the commons.

> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
> Advance your careers (forgot that one last time).  Be reasonable and
> professional.

Sounds like good advice to me :-)

> Supporting last year's projects is someone else's business
> opportunity.

True!

> ps.  Are all y'all following this thread?
>
>      http://news.ycombinator.com/item?id=5123022
>
> Maybe someone should search down for this bit: "Where to start? Any
> list of this [sic] projects?" and insert a plug for the various
> open-bio projects.  (But "someone" doesn't work here, he said...).

Interesting discussion, though the original post is too cynical even
for my taste.

Leon

From cjfields at illinois.edu  Fri Feb  8 09:08:56 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 8 Feb 2013 14:08:56 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
	<CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1DA2D@CHIMBX5.ad.uillinois.edu>

On Feb 8, 2013, at 5:18 AM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell <hartzell at alerce.com> wrote:
>> But I'm so sick of getting into arguments (or walking away from
>> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
>> you can't write good code in Perl, look - Ruby has GEMS!, etc...
>> 
>> Perl of the olden days was an easy language in which to write really
>> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
>> help; roll your own OO, roll your own distro-building, mountains of
>> monkey-work to provide consistent POD, versioning, etc...
>> 
>> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
>> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
>> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.
> 
> I share that experience.
> 
>> But BioPerl *is* dying.  You might be standing on the shoulders of
>> giants when you use it to solve a problem, but you *definitely* have
>> those same giants (and their extended families) on your shoulders
>> every time I see you try to move the project forward.  All of that
>> history has become the tail that's wagging the dog.
> 
> I share your sentiment. Most of BioPerl is architected so badly I
> can't stomach it most days, and I've worked on hairy codebases
> including perl itself. There's just too much sick and wrong. It's like
> hundreds of dot-com-era cgi scripts.
> 
> The problem (which is common in scientific computing) is that once
> code works it's effectively abandoned. BioPerl is essentially a
> gathering of more than a thousand such modules.

Yep, the progression from 'it works' to 'it works very well' tends to have very high activation energy.  Many of the fixes tend to be more bandaids (get it working) than fundamental surgery.  I tried my hand at this, got a few things done.

>> If all y'all are going to keep the thing alive, moving forward and
>> contributing to new great works then make Apple your hero.  Deprecate
>> the stuff that's holding you back, give folks a path forward and move
>> on.
> 
> That would be lovely, but who is going to do that? We're suffering
> from the tragedy of the commons.

Spot on, but we could break that path for the time being.  I think BioPerl as is will have to be in maintenance mode; we need a new effort to break with older perl, older practices.  

>> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
>> Advance your careers (forgot that one last time).  Be reasonable and
>> professional.
> 
> Sounds like good advice to me :-)
> 
>> Supporting last year's projects is someone else's business
>> opportunity.
> 
> True!

We just need to make a bioperl 1.x branch for the maintenance bit, rechristen 'master' as 'v2', and just move on to fixing the f****** code.  Let's move on that.

>> ps.  Are all y'all following this thread?
>> 
>>     http://news.ycombinator.com/item?id=5123022
>> 
>> Maybe someone should search down for this bit: "Where to start? Any
>> list of this [sic] projects?" and insert a plug for the various
>> open-bio projects.  (But "someone" doesn't work here, he said...).
> 
> Interesting discussion, though the original post is too cynical even
> for my taste.
> 
> Leon

Yes, that's not unusual unfortunately.  We have a number of physicists and mathematicians here who have started their initial forays into computational biology; they're all startled at how noisy it is and how messy code can be.  Of course their disciplines have had the benefit of teaching students how to (somewhat decently) code for the last 40 years.

chris

From l.m.timmermans at students.uu.nl  Fri Feb  8 07:08:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Fri, 8 Feb 2013 13:08:06 +0100
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
References: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
Message-ID: <CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>

On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
> Short version:
> I'd recommend splitting the project into much smaller ones. Some of the
> small ones will wither and die but those are the less important ones,
> and that will allow the others, the ones that people care about, freedom to
> grow faster. Bioperl would still be just one project that
> incorporates a hundred or so smaller modules. Let those who care
> the most about a specific module take care of it and make the
> releases. Releasing a module becomes much simpler, which means more
> releases, more activity, and the smaller code base for each module
> also makes it less intimidating for new contributors.

That has been a goal for some time now, but it's fairly complicated.
Not only do we have a LOT of modules (bioperl-live alone is more than
900), they also have complicated dependencies. I've attached the
results of my static dependency analysis of bioperl-live. I suspect
this split-up needs to be done by automated graph analysis, it's too much
to do by hand.
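
As an illustration of the kind of static analysis meant here (a minimal
sketch, not the script that produced the attached files): scan module source
for use, require, use base and use parent statements that name Bio:: modules,
and print the edges in DOT format. The module sources below are tiny made-up
strings and bio_dependencies is a hypothetical helper; a real run would read
the .pm files from a checkout, and a regex scan like this will miss modules
that are loaded dynamically.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return the sorted, unique Bio::* modules that a piece
# of source text loads via use/require or inherits via use base/parent.
sub bio_dependencies {
    my ($source) = @_;
    my %deps;
    for my $line (split /\n/, $source) {
        next if $line =~ /^\s*#/;    # skip comment lines
        if ($line =~ /^\s*use\s+(?:base|parent)\b(.*)/) {
            my $rest = $1;
            # every Bio:: class named on the inheritance line
            $deps{$_} = 1 for $rest =~ /(Bio(?:::\w+)+)/g;
        }
        elsif ($line =~ /^\s*(?:use|require)\s+(Bio(?:::\w+)+)/) {
            $deps{$1} = 1;
        }
    }
    return sort keys %deps;
}

# Made-up miniature "modules" standing in for files read from disk.
my %source_for = (
    'Bio::Seq' =>
        "use strict;\nuse base qw(Bio::Root::Root Bio::SeqI);\nuse Bio::PrimarySeq;\n",
    'Bio::PrimarySeq' =>
        "use base qw(Bio::Root::Root);\n# use Bio::Ignored;\n",
);

# Emit one DOT edge per (module, dependency) pair.
print "digraph deps {\n";
for my $module (sort keys %source_for) {
    for my $dep (bio_dependencies($source_for{$module})) {
        print qq{  "$module" -> "$dep";\n};
    }
}
print "}\n";
```

The resulting edge list is the input a graph tool would need to look for
clusters of modules that could be split off together.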

Leon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.dot
Type: application/octet-stream
Size: 93463 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20130208/bdbbda1e/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.png
Type: image/png
Size: 6694525 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20130208/bdbbda1e/attachment-0001.png>

From sebastien.moretti at unil.ch  Fri Feb  8 11:19:29 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=)
Date: Fri, 08 Feb 2013 17:19:29 +0100
Subject: [Bioperl-l] PhyloXML
Message-ID: <51152591.9010402@unil.ch>

Hi

I would like to add some XML to an existing PhyloXML tree.

I have no problem reading and writing it.
I would like to add <name>something</name> after the <phylogeny> tag, as in
http://www.phyloxml.org/examples_syntax/phyloxml_syntax_example_1.html
but I run into problems with add_phyloXML_annotation():

Can't locate object method "annotation" via package "Bio::Tree::Tree" at
         /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, <GEN0> line 1 (#1)
     (F) You called a method correctly, and it correctly indicated a package
     functioning as a class, but that package doesn't define that particular
     method, nor does any of its base classes.  See perlobj.

Uncaught exception from user code:
         Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, <GEN0> line 1.
  at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984
         Bio::TreeIO::phyloxml::element_default('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 670
         Bio::TreeIO::phyloxml::processXMLNode('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 309
         Bio::TreeIO::phyloxml::add_phyloXML_annotation('Bio::TreeIO::phyloxml=HASH(0x134b1268)', '-obj', 'Bio::Tree::Tree=HASH(0x13525258)', '-xml', '<name>SUMF family</name>') called at ./add_annotation_to_phyloxml.pl line 40


I think I am doing something wrong, but what?
Here is the code:

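use Bio::TreeIO;

# $infile is set earlier in the script (not shown)
my $treeio = Bio::TreeIO->new(-file   => $infile,
                              -format => 'phyloxml',
                             );
my $tree = $treeio->next_tree;

# Add annotation (this is the call that fails; -obj is a Bio::Tree::Tree)
$treeio->add_phyloXML_annotation(-obj => $tree,
                                 -xml => '<name>SUMF family</name>',
                                );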

-- 
Sébastien Moretti


From cjfields at illinois.edu  Sat Feb  9 01:25:17 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 06:25:17 +0000
Subject: [Bioperl-l] BioPerl future
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>

All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.  

Thoughts?  

chris

From cjfields at illinois.edu  Sat Feb  9 01:43:24 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 06:43:24 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>
References: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
	<CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F2C6@CHIMBX5.ad.uillinois.edu>

On Feb 8, 2013, at 6:08 AM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>> Short version:
>> I'd recommend splitting the project into much smaller ones. Some of the
>> small ones will wither and die but those are the less important ones,
>> and that will allow the others, the ones that people care about, freedom to
>> grow faster. Bioperl would still be just one project that
>> incorporates a hundred or so smaller modules. Let those who care
>> the most about a specific module take care of it and make the
>> releases. Releasing a module becomes much simpler, which means more
>> releases, more activity, and the smaller code base for each module
>> also makes it less intimidating for new contributors.
> 
> That has been a goal for some time now, but it's fairly complicated.
> Not only do we have a LOT of modules (bioperl-live alone is more than
> 900), they also have complicated dependencies. I've attached the
> results of my static dependency analysis of bioperl-live. I suspect
> this split-up needs to be done by automated graph analysis, it's too much
> to do by hand.
> 
> Leon
> <deps.dot><deps.png>

Leon, 

I'm hoping we can do this sooner rather than later.  In fact, if we proceed with making a 'v1' branch or something similar, we can start extricating code within the next few weeks.

chris

From cjfields at illinois.edu  Sat Feb  9 08:51:35 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 13:51:35 +0000
Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future
Message-ID: <prc698q0fqtymq1n70jhdi5w.1360417710993@email.android.com>

Sheldon,

The branch is where the old (v1.x) code would reside.  Master branch would be v2.

Chris


Sent via phone


-------- Original message --------
From: Sheldon McKay <sheldon.mckay at gmail.com>
Date:
To: "Fields, Christopher J" <cjfields at illinois.edu>
Cc: BioPerl List <Bioperl-l at lists.open-bio.org>,gmod-gbrowse at lists.sourceforge.net
Subject: Re: [Gmod-gbrowse] BioPerl future


Hi Chris,

This sounds like a good idea.  I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands.

One question I have is whether the name of branch v1 might be perceived as a step backward.  How about v2?

Sheldon

On Saturday, February 9, 2013, Fields, Christopher J wrote:
All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.

Thoughts?

chris


--
Sheldon McKay, PhD
Computational Biologist
DNA Learning Center
Cold Spring Harbor Laboratory
1 Bungtown Rd
Cold Spring Harbor, NY 11724
(516) 367-5185
www.dnalc.org


From sheldon.mckay at gmail.com  Sat Feb  9 08:04:50 2013
From: sheldon.mckay at gmail.com (Sheldon McKay)
Date: Sat, 9 Feb 2013 08:04:50 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAEs59kkOhJ-czn_aXOcP+yOszQdGGLgaAMNp+u_0MqS=xXapng@mail.gmail.com>

Hi Chris,

This sounds like a good idea.  I think it will eventually allow bioperl to
evolve into a leaner, meaner package that would be more likely to be
adopted by new or isolated bioinformaticians, who tend to be put off by
the size and complexity of bioperl as it now stands.

One question I have is whether the name of branch v1 might be perceived as
a step backward.  How about v2?

Sheldon

On Saturday, February 9, 2013, Fields, Christopher J wrote:

> All,
>
> (cross-posting to gmod-gbrowse)
>
> I want to gauge the community's thoughts on a few things.  At the moment I
> think we can safely say that BioPerl 1.x is in maintenance mode.  By
> 'maintenance mode', I mean that we can only do so much with it w/o breaking
> backwards compatibility with old scripts.  We need a way forward so that we
> can address fundamental problems within the core codebase, namely speed.
>
> I am thinking at the moment of pushing a 'v1' branch next week after I
> make an official announcement, with a new 1.6 release coming out from that
> branch (as already announced, tentatively scheduled for March 1).  That
> frees up master for any code development, removal of modules/cruft, etc.
>  This will open an initial path forward and at least enable us to do more.
>  Make sense?  This of course means that any code reliant on v1 should pull
> from that branch instead of 'master'.
>
> Thoughts?
>
> chris


-- 
Sheldon McKay, PhD
Computational Biologist
DNA Learning Center
Cold Spring Harbor Laboratory
1 Bungtown Rd
Cold Spring Harbor, NY 11724
(516) 367-5185
www.dnalc.org

From cjfields at illinois.edu  Sat Feb  9 23:25:14 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 10 Feb 2013 04:25:14 +0000
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>

Apologies if you receive this twice. I never received the replies from the gbrowse list through bioperl-l so it is possible there were mail issues last night.

------------------------

All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.  

Thoughts?  

chris


From genehack at genehack.org  Sat Feb  9 23:36:07 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Sat, 9 Feb 2013 20:36:07 -0800
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>
References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>
Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6@genehack.org>

On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:

> Thoughts?  

+1

The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. 

j.

-- 
John SJ Anderson // genehack at genehack.org


From carandraug+dev at gmail.com  Sun Feb 10 13:40:33 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Sun, 10 Feb 2013 18:40:33 +0000
Subject: [Bioperl-l] BioPerl future
Message-ID: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>

On 10 February 2013 17:00,  <bioperl-l-request at lists.open-bio.org> wrote:
> Message: 3
> Date: Sat, 9 Feb 2013 20:36:07 -0800
> From: John SJ Anderson <genehack at genehack.org>
> Subject: Re: [Bioperl-l] BioPerl future
> To: "Fields, Christopher J" <cjfields at illinois.edu>
> Cc: BioPerl List <Bioperl-l at lists.open-bio.org>
> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org>
> Content-Type: text/plain; charset=us-ascii
>
> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
>
>> Thoughts?
>
> +1
>
> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories.

For those interested, I have just added instructions on the wiki on
how to split a subset of modules, tests, files, etc from the
bioperl-live repository into a new repository while keeping their old
history.

http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live

Carnë


From cjfields at illinois.edu  Sun Feb 10 15:08:35 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 10 Feb 2013 20:08:35 +0000
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>
References: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE20632@CHIMBX5.ad.uillinois.edu>

On Feb 10, 2013, at 12:40 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 10 February 2013 17:00,  <bioperl-l-request at lists.open-bio.org> wrote:
>> Message: 3
>> Date: Sat, 9 Feb 2013 20:36:07 -0800
>> From: John SJ Anderson <genehack at genehack.org>
>> Subject: Re: [Bioperl-l] BioPerl future
>> To: "Fields, Christopher J" <cjfields at illinois.edu>
>> Cc: BioPerl List <Bioperl-l at lists.open-bio.org>
>> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
>> 
>>> Thoughts?
>> 
>> +1
>> 
>> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories.
> 
> For those interested, I have just added instructions on the wiki on
> how to split a subset of modules, tests, files, etc from the
> bioperl-live repository into a new repository while keeping their old
> history.
> 
> http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live
> 
> Carnë

It's probably worth looking at this page as well, then:

http://www.bioperl.org/wiki/BioPerl_Modularization

We should probably merge the two.

chris


From hlapp at drycafe.net  Sun Feb 10 20:03:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Sun, 10 Feb 2013 20:03:34 -0500
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <51152591.9010402@unil.ch>
References: <51152591.9010402@unil.ch>
Message-ID: <F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>

On Feb 8, 2013, at 11:19 AM, Moretti Sébastien <sebastien.moretti at unil.ch> wrote:

> # Add annotation
> $treeio->add_phyloXML_annotation(-obj => $tree,
>                                -xml => '<name>SUMF family</name>',
>                               );

If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?

	-hilmar

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From sebastien.moretti at unil.ch  Mon Feb 11 02:08:22 2013
From: sebastien.moretti at unil.ch (Sébastien MORETTI)
Date: Mon, 11 Feb 2013 08:08:22 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
Message-ID: <511898E6.7060400@unil.ch>

>> # Add annotation
>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>                                 -xml => '<name>SUMF family</name>',
>>                                );
>
> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>
> 	-hilmar

I replaced $treeio with $tree in the above line but still get an error.
I don't see what you mean by "the stack suggests that the above isn't the
exact line in your script".

The only thing I changed is the length of the xml string I try to
insert. But I get the same error with an empty xml string.


my $treeio = new Bio::TreeIO(-file   => "$infile",
                              -format => 'phyloxml',
                             );
my $tree = $treeio->next_tree;

# Add annotation
$tree->add_phyloXML_annotation(-obj => $tree,
                                -xml => '<name>SUMF family</name>',
                               );

Can't locate object method "add_phyloXML_annotation" via package
	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> 
line 1 (#1)
     (F) You called a method correctly, and it correctly indicated a package
     functioning as a class, but that package doesn't define that particular
     method, nor does any of its base classes.  See perlobj.

Uncaught exception from user code:
	Can't locate object method "add_phyloXML_annotation" via package 
"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
  at ./add_annotation_to_phyloxml.pl line 40
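
The "Can't locate object method" message above is Perl's generic report that the class of the invocant (here Bio::Tree::Tree) and its base classes define no method of that name. Which class does provide add_phyloXML_annotation is not established in this thread. As a minimal, self-contained sketch of the same situation, the two packages below (My::TreeIO and My::Tree, both made-up placeholders and not BioPerl classes) show how can() may be used to check which object provides a method before calling it:

```perl
use strict;
use warnings;

# Placeholder classes for illustration only; they are not BioPerl modules.
{
    package My::Tree;
    sub new { return bless {}, shift }
}

{
    package My::TreeIO;
    sub new       { return bless {}, shift }
    sub next_tree { return My::Tree->new }

    # In this sketch the annotation method is defined on the IO class,
    # not on the tree class.
    sub add_annotation { my ($self, %args) = @_; return $args{-obj} }
}

package main;

my $io   = My::TreeIO->new;
my $tree = $io->next_tree;

# can() returns a code reference if the method is defined, false otherwise.
for my $obj ($io, $tree) {
    printf "%s %s add_annotation\n", ref($obj),
        $obj->can('add_annotation') ? 'provides' : 'does not provide';
}

# Calling the method on the wrong object dies with the familiar message.
my $ok = eval { $tree->add_annotation(-obj => $tree); 1 };
print "Error: $@" unless $ok;
```

Running a check like this against the real $treeio and $tree objects would show which of them, if either, can respond to the method.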


-- 
Sébastien Moretti
Department of Ecology and Evolution,
Biophore, University of Lausanne,
CH-1015 Lausanne, Switzerland
Tel.: +41 (21) 692 4221/4079
http://bioinfo.unil.ch/

From saladi1 at illinois.edu  Tue Feb 12 16:24:34 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Tue, 12 Feb 2013 13:24:34 -0800
Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons
Message-ID: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>

Hi,

I am using the count_codons method from Bio::Tools::SeqStats and keep
getting "AMBIGUOUS" codons, but I can't figure out why exactly.

When I translate the same sequence that gives the error using another
standard utility like (ExPASy - Translate), it seems to work alright.

An example sequence is below. Could anyone lend some insight?

Thanks,
Shyam


codon         value
AAA           1.722488038277511961722488038277511961722
AAC           2.966507177033492822966507177033492822967
AAG           1.531100478468899521531100478468899521531
AAT           0.9569377990430622009569377990430622009569
ACA           0.4784688995215311004784688995215311004785
ACC           1.722488038277511961722488038277511961722
ACG           1.33971291866028708133971291866028708134
ACT           1.913875598086124401913875598086124401914
AGA           0.1913875598086124401913875598086124401914
AGC           0.7655502392344497607655502392344497607656
AGT           1.435406698564593301435406698564593301435
*AMBIGUOUS*   *0.09569377990430622009569377990430622009569*
ATA           0.3827751196172248803827751196172248803828
ATC           2.488038277511961722488038277511961722488
ATG           3.349282296650717703349282296650717703349
ATT           3.636363636363636363636363636363636363636
CAA           2.870813397129186602870813397129186602871
CAC           0.3827751196172248803827751196172248803828
CAG           1.626794258373205741626794258373205741627
CAT           0.4784688995215311004784688995215311004785
CCA           1.722488038277511961722488038277511961722
CCC           0.5741626794258373205741626794258373205742
CCG           1.052631578947368421052631578947368421053
CCT           1.244019138755980861244019138755980861244
CGA           0.3827751196172248803827751196172248803828
CGC           0.7655502392344497607655502392344497607656
CGG           0.1913875598086124401913875598086124401914
CGT           2.488038277511961722488038277511961722488
CTA           0.4784688995215311004784688995215311004785
CTC           0.6698564593301435406698564593301435406699
CTG           2.105263157894736842105263157894736842105
CTT           0.8612440191387559808612440191387559808612
GAA           2.870813397129186602870813397129186602871
GAC           1.435406698564593301435406698564593301435
GAG           1.722488038277511961722488038277511961722
GAT           2.775119617224880382775119617224880382775
GCA           2.00956937799043062200956937799043062201
GCC           2.488038277511961722488038277511961722488
GCG           3.540669856459330143540669856459330143541
GCT           2.00956937799043062200956937799043062201
GGA           0.1913875598086124401913875598086124401914
GGC           2.392344497607655502392344497607655502392
GGG           0.8612440191387559808612440191387559808612
GGT           5.454545454545454545454545454545454545455
GTA           1.913875598086124401913875598086124401914
GTC           0.8612440191387559808612440191387559808612
GTG           4.593301435406698564593301435406698564593
GTT           2.679425837320574162679425837320574162679
TAA           0.09569377990430622009569377990430622009569
TAC           1.148325358851674641148325358851674641148
TAT           1.148325358851674641148325358851674641148
TCA           0.8612440191387559808612440191387559808612
TCC           0.4784688995215311004784688995215311004785
TCG           2.105263157894736842105263157894736842105
TCT           0.9569377990430622009569377990430622009569
TGG           0.9569377990430622009569377990430622009569
TGT           0.09569377990430622009569377990430622009569
TTA           2.679425837320574162679425837320574162679
TTC           2.966507177033492822966507177033492822967
TTG           3.062200956937799043062200956937799043062
TTT           2.775119617224880382775119617224880382775
count         1045
filename      temp.seq

ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTACGCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTCGTATTTGCCTTCGTACCACCTGC
GGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAGATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTAGGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA


From bosborne11 at verizon.net  Tue Feb 12 21:30:08 2013
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 12 Feb 2013 21:30:08 -0500
Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons
In-Reply-To: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>
References: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>
Message-ID: <C13C35A7-4DBE-4797-A584-DCB6AF772D25@verizon.net>

Shyam,

An ambiguous codon would be one that has a character other than [ACTGU] in it. I see '!' in your sequences; that would create an ambiguous codon.
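
As a minimal sketch of that check in plain Perl (it does not use Bio::Tools::SeqStats, and the helper name find_ambiguous_codons is made up for illustration), the code below splits a sequence into codons and reports any codon containing a character outside A, C, G, T or U. Note that the sequence as originally posted contains a 'Y' (in ...GGCTTYTTGATGG...), the IUPAC code for C or T, which a check like this would also flag. The reading frame of the short excerpt used here is chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return every codon, with its
# 1-based codon index, that contains a character other than A, C, G, T or U.
sub find_ambiguous_codons {
    my ($seq) = @_;
    $seq =~ s/\s+//g;    # drop whitespace and line breaks
    $seq = uc $seq;

    my @hits;
    my $index = 0;
    while ($seq =~ /(.{3})/g) {    # a trailing partial codon is ignored
        $index++;
        my $codon = $1;
        push @hits, [$index, $codon] if $codon =~ /[^ACGTU]/;
    }
    return @hits;
}

# Short excerpt around the 'Y' in the posted sequence.
my @hits = find_ambiguous_codons('TCTGGCTTYTTGATG');
printf "codon %d: %s\n", @$_ for @hits;    # prints "codon 3: TTY"
```

Running the same function over the full sequence would list every codon that a strict [ACGTU] test treats as ambiguous.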

Brian O.


On Feb 12, 2013, at 4:24 PM, Shyam Saladi <saladi1 at illinois.edu> wrote:

> Hi,
> 
> I am using the count_codons method from Bio::Tools::SeqStats and keep
> getting "AMBIGUOUS" codons, but I can't figure out why exactly.
> 
> When I translate the same sequence that gives the error using another
> standard utility like (ExPASy - Translate), it seems to work alright.
> 
> An example sequence is below. Could anyone lend some insight?
> 
> Thanks,
> Shyam
> 
> 
> 
> AAA     AAC     AAG     AAT     ACA     ACC     ACG     ACT     AGA     AGC
>    AGT     *AMBIGUOUS*       ATA     ATC     ATG     ATT     CAA     CAC
>  CAG     CAT     CCA     CCC     CCG     CCT     CGA     CGC     CGG
> CGT     CTA     CTC     CTG     CTT     GAA     GAC     GAG     GAT     GCA
>    GCC     GCG     GCT     GGA     GGC     GGG     GGT     GTA     GTC
> GTG     GTT     TAA     TAC     TAT     TCA     TCC     TCG     TCT     TGG
>    TGT     TTA     TTC     TTG     TTT     count   filename
> 1.722488038277511961722488038277511961722
> 2.966507177033492822966507177033492822967
> 1.531100478468899521531100478468899521531
> 0.9569377990430622009569377990430622009569
> 0.4784688995215311004784688995215311004785
> 1.722488038277511961722488038277511961722
> 1.33971291866028708133971291866028708134
> 1.913875598086124401913875598086124401914
> 0.1913875598086124401913875598086124401914
> 0.7655502392344497607655502392344497607656
> 1.435406698564593301435406698564593301435       *
> 0.09569377990430622009569377990430622009569*
> 0.3827751196172248803827751196172248803828
> 2.488038277511961722488038277511961722488
> 3.349282296650717703349282296650717703349
> 3.636363636363636363636363636363636363636
> 2.870813397129186602870813397129186602871
> 0.3827751196172248803827751196172248803828
> 1.626794258373205741626794258373205741627
> 0.4784688995215311004784688995215311004785
> 1.722488038277511961722488038277511961722
> 0.5741626794258373205741626794258373205742
> 1.052631578947368421052631578947368421053
> 1.244019138755980861244019138755980861244
> 0.3827751196172248803827751196172248803828
> 0.7655502392344497607655502392344497607656
> 0.1913875598086124401913875598086124401914
> 2.488038277511961722488038277511961722488
> 0.4784688995215311004784688995215311004785
> 0.6698564593301435406698564593301435406699
> 2.105263157894736842105263157894736842105
> 0.8612440191387559808612440191387559808612
> 2.870813397129186602870813397129186602871
> 1.435406698564593301435406698564593301435
> 1.722488038277511961722488038277511961722
> 2.775119617224880382775119617224880382775
> 2.00956937799043062200956937799043062201
> 2.488038277511961722488038277511961722488
> 3.540669856459330143540669856459330143541
> 2.00956937799043062200956937799043062201
> 0.1913875598086124401913875598086124401914
> 2.392344497607655502392344497607655502392
> 0.8612440191387559808612440191387559808612
> 5.454545454545454545454545454545454545455
> 1.913875598086124401913875598086124401914
> 0.8612440191387559808612440191387559808612
> 4.593301435406698564593301435406698564593
> 2.679425837320574162679425837320574162679
> 0.09569377990430622009569377990430622009569
> 1.148325358851674641148325358851674641148
> 1.148325358851674641148325358851674641148
> 0.8612440191387559808612440191387559808612
> 0.4784688995215311004784688995215311004785
> 2.105263157894736842105263157894736842105
> 0.9569377990430622009569377990430622009569
> 0.9569377990430622009569377990430622009569
> 0.09569377990430622009569377990430622009569
> 2.679425837320574162679425837320574162679
> 2.966507177033492822966507177033492822967
> 3.062200956937799043062200956937799043062
> 2.775119617224880382775119617224880382775       1045    temp.seq
> 
> ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTAC!
> GCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTC!
> GTATTTGCCTTCGTACCACCTGCGGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAG
> ATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTA!
> GGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Feb 13 10:18:10 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 15:18:10 +0000
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>

All,

tl;dr: A lot of change is coming.  Be forewarned and be prepared.

This is an 'official' announcement to the BioPerl community on future BioPerl plans.  We have decided to move continued maintenance of the current BioPerl release series over to the new 'v1' branch.  This branch will be the point where any future versions of 1.6.x code will be released, starting with the (already-scheduled) March 1 release.  The 'master' branch will become the main focal point for future development of BioPerl going into an eventual v2 release, with a focus on performance enhancements, addressing newer technologies like NGS and large data, code cleanup, and simplifying the code base.

We welcome any help with code improvements. GMOD folks? Want to help? This is a good opportunity to address BioPerl shortcomings in the code base!

What this means for anyone using BioPerl currently:

1) We anticipate significant issues if you are relying on the 'master' branch for anything.  To state it inelegantly, the core developers are taking back the 'master' branch for future development. Please please please do not rely on the 'master' branch for stable code; if you rely on BioPerl 1.6.x, make sure to use 'v1'.  We can revisit whether to make 'v1' the default checkout branch if/when the need arises.

2) Expect not to find some modules.  We will be migrating modules requiring external dependencies and other associated chunks of the code base out into their own repositories over the next year to help future maintenance; the eventual intent is to release all of these independently on CPAN.  We will completely remove all code previously marked as deprecated, and we may immediately deprecate additional modules if needed (this will of course be discussed on list).

3) Expect version numbering to change significantly.  Because we are releasing code in separate repositories, I fully expect downstream versioning problems if we stick with the current system (e.g. all bioperl-live modules having the same version).  It will be too much of a headache to sync versions for all modules as this will entail making a full release of all bioperl code, one of the main reasons we are splitting out code to begin with.  At the moment, no specific versioning scheme has been chosen, though I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions).  This is the standard that Lincoln has adopted for Bio::Graphics and GBrowse.

4) Expect quick deprecation of methods within modules as needed.  These should of course be brought up on the mailing list prior to actual implementation, but I would anticipate some things changing as we try to adopt a more consistent method naming scheme.

5) The same steps outlined for bioperl-live will apply for bioperl-run modules.  We will have to decide the best approach to use for those, e.g. whether to separate them out based on task (alignment), application group (NGS, BLAST, RNA), etc. and how these may fit organically with bioperl-live modules where appropriate.

6) Do not expect a new CPAN release of such code until Dec 2013.  Even then it will be in an alpha stage.  We are all busy campers.
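
On the versioning question in point 3, the following is a small sketch (not project policy, just an illustration using Perl's core version module) of how two-part X.Y numbers and three-part numbers such as 1.6.901 are parsed and compared. The specific version strings other than 1.6.901 are made up for the example.

```perl
use strict;
use warnings;
use version;

# Three-part ("dotted-decimal") and two-part ("decimal") version strings
# are parsed differently by the core version module.
my $three_part = version->parse('1.6.901');
my $two_part   = version->parse('1.7');

print "1.7 is newer than 1.6.901\n" if $two_part > $three_part;

# X.Y versions compare as decimal numbers, so 1.10 sorts before 1.9.
print "1.10 is older than 1.9\n"
    if version->parse('1.10') < version->parse('1.9');
```

Whatever scheme is eventually chosen, a check like this can help confirm that each new release number sorts after the previous one.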

We do not anticipate significant changes to bioperl-network or bioperl-db at this time beyond updating them to deal with new changes. 

I'm sure there are many other points that need to be discussed.   Please reply over the next week if you have any concerns. 

chris

From cjfields at illinois.edu  Wed Feb 13 11:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 16:01:07 +0000
Subject: [Bioperl-l] Test-pls ignore
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2506D@CHIMBX5.ad.uillinois.edu>

testing the mail list to see if it is working.

-c


From sebastien.moretti at unil.ch  Wed Feb 13 11:21:23 2013
From: sebastien.moretti at unil.ch (Moretti Sébastien)
Date: Wed, 13 Feb 2013 17:21:23 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
Message-ID: <511BBD83.2000708@unil.ch>

>>>> # Add annotation
>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>                                 -xml => '<name>SUMF family</name>',
>>>>                                );
>>>
>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>
>>> 	-hilmar
>>
>> I replaced $treeio with $tree in the above line but still get an error.
>> I don't see what you mean by "the stack suggests that the above isn't the exact line in your script".
>>
>> The only thing I changed is the length of the xml string I try to insert. But I get the same error with an empty xml string.
>>
>>
>>
>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>                              -format => 'phyloxml',
>>                             );
>> my $tree = $treeio->next_tree;
>>
>> # Add annotation
>> $tree->add_phyloXML_annotation(-obj => $tree,
>>                                -xml => '<name>SUMF family</name>',
>>                               );
>>
>> Can't locate object method "add_phyloXML_annotation" via package
>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>     (F) You called a method correctly, and it correctly indicated a package
>>     functioning as a class, but that package doesn't define that particular
>>     method, nor does any of its base classes.  See perlobj.
>>
>> Uncaught exception from user code:
>> 	Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
>> at ./add_annotation_to_phyloxml.pl line 40
>
> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>
> chris

Do you mean that BioPerl 1.6.901 does not fully support PhyloXML?
Is the problem I have "expected"?

-- 
Sébastien Moretti

From cjfields at illinois.edu  Wed Feb 13 10:47:17 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 15:47:17 +0000
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <511898E6.7060400@unil.ch>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>

On Feb 11, 2013, at 1:08 AM, Sébastien MORETTI <sebastien.moretti at unil.ch> wrote:

>>> # Add annotation
>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>                                -xml => '<name>SUMF family</name>',
>>>                               );
>> 
>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>> 
>> 	-hilmar
> 
> I replaced $treeio by $tree in the above line but still get an error.
> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
> 
> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string.
> 
> 
> 
> my $treeio = new Bio::TreeIO(-file   => "$infile",
>                             -format => 'phyloxml',
>                            );
> my $tree = $treeio->next_tree;
> 
> # Add annotation
> $tree->add_phyloXML_annotation(-obj => $tree,
>                               -xml => '<name>SUMF family</name>',
>                              );
> 
> Can't locate object method "add_phyloXML_annotation" via package
> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>    (F) You called a method correctly, and it correctly indicated a package
>    functioning as a class, but that package doesn't define that particular
>    method, nor does any of its base classes.  See perlobj.
> 
> Uncaught exception from user code:
> 	Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
> at ./add_annotation_to_phyloxml.pl line 40
> 
> 
> 
> -- 
> Sébastien Moretti
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221/4079
> http://bioinfo.unil.ch/

Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.

chris


From carandraug+dev at gmail.com  Wed Feb 13 12:23:23 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Wed, 13 Feb 2013 17:23:23 +0000
Subject: [Bioperl-l] Next BioPerl release
Message-ID: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>

On 5 February 2013 21:53, Fields, Christopher J <cjfields at illinois.edu> wrote:
> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!

Hi

Is this a release of bioperl-live only, or does it also include bioperl-run?

Carnë


From cjfields at illinois.edu  Wed Feb 13 12:08:21 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 17:08:21 +0000
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <511BBD83.2000708@unil.ch>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
	<511BBD83.2000708@unil.ch>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 10:21 AM, Moretti Sébastien <sebastien.moretti at unil.ch> wrote:

>>>>> # Add annotation
>>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>>                                -xml => '<name>SUMF family</name>',
>>>>>                               );
>>>> 
>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>> 
>>>> 	-hilmar
>>> 
>>> I replaced $treeio by $tree in the above line but still get an error.
>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
>>> 
>>> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string.
>>> 
>>> 
>>> 
>>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>>                             -format => 'phyloxml',
>>>                            );
>>> my $tree = $treeio->next_tree;
>>> 
>>> # Add annotation
>>> $tree->add_phyloXML_annotation(-obj => $tree,
>>>                               -xml => '<name>SUMF family</name>',
>>>                              );
>>> 
>>> Can't locate object method "add_phyloXML_annotation" via package
>>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>>    (F) You called a method correctly, and it correctly indicated a package
>>>    functioning as a class, but that package doesn't define that particular
>>>    method, nor does any of its base classes.  See perlobj.
>>> 
>>> Uncaught exception from user code:
>>> 	
>>> at ./add_annotation_to_phyloxml.pl line 40
>> 
>> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>> 
>> chris
> 
> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ?
> The problem I have is "expected" ?
> 
> -- 
> Sébastien Moretti

I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky.  I tried cleaning this up a few years back but didn't make much progress.

The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it):

    $treeio->add_phyloXML_annotation(-obj => $tree,
                              -xml => '<name>SUMF family</name>',
                             );

My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back.  Can you file a bug on this?

https://redmine.open-bio.org/

chris


From cjfields at illinois.edu  Wed Feb 13 13:05:53 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 18:05:53 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 11:23 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 5 February 2013 21:53, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!
> 
> Hi
> 
> is this release of bioperl-live only or also includes bioperl-run?
> 
> Carnë

We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.

chris

From l.m.timmermans at students.uu.nl  Wed Feb 13 14:44:22 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 13 Feb 2013 20:44:22 +0100
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>

On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.

Also, bioperl-run needs to be cut into smaller distributions even more
than bioperl-live. Few people, if anyone at all, have all the tools it
tries to wrap at hand, so it's almost impossible to pass its test suite.

We need dists that can realistically pass.

Leon


From cjfields at illinois.edu  Wed Feb 13 16:04:26 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 21:04:26 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25B07@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 1:44 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.
> 
> Also, bioperl-run needs to be cut into smaller distributions even more
> than bioperl-live. Few people if anyone at all has all tools it tries
> to wrap at hand, so its almost impossible to pass its testing suite.
> 
> We need dists that can realistically pass.
> 
> Leon

Yup.  It's a mess.

chris

From florent.angly at gmail.com  Wed Feb 13 17:33:14 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 14 Feb 2013 08:33:14 +1000
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
Message-ID: <511C14AA.9030107@gmail.com>

On 14/02/13 01:18, Fields, Christopher J wrote:
> I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions)
Yes, I support the X.Y versioning as well.
Florent

From l.m.timmermans at students.uu.nl  Wed Feb 13 18:12:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Thu, 14 Feb 2013 00:12:06 +0100
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <511C14AA.9030107@gmail.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
	<511C14AA.9030107@gmail.com>
Message-ID: <CAC1jpXBk9prChjjeHmnykWh4j7FRMN1adY0ibzM8uqH1+Z5uGA@mail.gmail.com>

On Wed, Feb 13, 2013 at 11:33 PM, Florent Angly <florent.angly at gmail.com> wrote:
> On 14/02/13 01:18, Fields, Christopher J wrote:
>>
>> I *highly* recommend using X.Y versioning for simplicity (e.g. no more
>> 3-point versions)
>
> Yes, I support the X.Y versioning as well.
> Florent

See also: http://www.dagolden.com/index.php/369/version-numbers-should-be-boring/

Leon

From daisieh at gmail.com  Thu Feb 14 00:21:15 2013
From: daisieh at gmail.com (Daisie Huang)
Date: Wed, 13 Feb 2013 21:21:15 -0800 (PST)
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
References: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
Message-ID: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>

I think you need to reset the pointer to the filehandle before you go 
through the while loop the second time: seek $fh,0,0
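
A minimal sketch of that suggestion, assuming a small in-memory "file" (a
hypothetical stand-in for the real tab-delimited data) so that it runs on its
own. The point is only that seek($fh, 0, 0) rewinds the handle, so the second
pass through the while loop has something to read:

```perl
use strict;
use warnings;

# Hypothetical two-line sample standing in for the real data file.
my $content = "1\tABF09-AS5\n2\tABF09-AS9\n";
open(my $fh, '<', \$content) or die "Can't open in-memory file: $!";

my %lines_seen;
for my $family ('AS5', 'AS9') {
    # Rewind to the start before each pass; without this, the second
    # pass begins at end-of-file and the while loop body never runs.
    seek($fh, 0, 0) or die "Can't seek: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /\t/, $line;
        $lines_seen{$family}++ if $fields[1] eq "ABF09-$family";
    }
}
close($fh) or die "Can't close: $!";

print "$_: $lines_seen{$_}\n" for sort keys %lines_seen;
```

Reading the file once and sorting rows into per-family buckets avoids the
rewind altogether, at the cost of holding the weights in memory.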

On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
>
> Hey Guys,
>
> I am still at the same place. I am writing these little pieces of code to 
> try to learn the language better, so any advice would be useful. I am again 
> parsing through tab delimited files and now trying to find fish from on id 
> (in these case families AS5 and AS9), retrieve the weights and average 
> them. When I started I did it for one family and it worked (instead of the 
> @families I had a scalar $family set to AS5). But really it is more useful 
> to look at more than one family at time (I should mention that are 2 types 
> of fish per family one ends in PS , the other doesn't). So I tried to use a 
> foreach loop to go through the file twice, once with a the search value set 
> to AS5 and a second time to AS9. It works for AS5, but for some reason, the 
> foreach loop sets $test to AS9 the second time, but it doesn't go through 
> the while loop. What am I doing wrong? 
>
> here is the code:
>
> #! /usr/bin/perl
> use strict;
> use warnings;
>
> my $file = $ARGV[0];
> my @family = ('AS5','AS9');
> my $i;
> my $ii;
> my $test;
>
> open (my $fh, "<", $file) or die ("Can't open $file: $!");
>
> foreach (@family){
>     $test = $_;
>     my @data_weight_2N = ();
>     my @data_weight_3N = ();
>     while (<$fh>){
>         chomp;  
>         my $line = $_;
>         my @data  = split ("\t", $line);
>         if ($data[0] !~ /[0-9]*/){
>         next;}
>         elsif ($data[1] eq "ABF09-$test"){
>             $i += 1; 
>             push (@data_weight_2N,  $data[6]);
>         }elsif ($data[1] eq "ABF09-".$test."PS"){
>         $ii += 1;
>             push (@data_weight_3N,$data[6]);
>     }
> }
>     my $mean_2N = &average (\@data_weight_2N);
>     my $stdev_2N = &stdev (\@data_weight_2N);
>     my $stderr_2N = ($stdev_2N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test 
> 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
>
>     my $mean_3N = &average (\@data_weight_3N);
>     my $stdev_3N = &stdev (\@data_weight_3N);
>     my $stderr_3N = ($stdev_3N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test 
> 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
> }
>
> close ($fh);
>
> sub average{
>         my($data) = @_;
>         if (not @$data) {
>                 print ("Empty array\n");
>                 return 0;
>         }
>         my $total = 0;
>         foreach (@$data) {
>                 $total += $_;
>         }
>         my $average = $total / @$data;
>         return $average;
> }
>
> sub stdev{
>         my($data) = @_;
>         if(@$data == 1){
>                 return 0;
>         }
>         my $average = &average($data);
>         my $sqtotal = 0;
>         foreach(@$data) {
>                 $sqtotal += ($average-$_) ** 2;
>         }
>         my $std = ($sqtotal / (@$data-1)) ** 0.5;
>         return $std;
> }
>
> Thanks,
>
> T.
>
> -- 
> "Education is not to be used to promote obscurantism." - Theodosius 
> Dobzhansky.
>
> "Gracias a la vida que me ha dado tanto
> Me ha dado el sonido y el abecedario
> Con él, las palabras que pienso y declaro
> Madre, amigo, hermano
> Y luz alumbrando la ruta del alma del que estoy amando
>
> Gracias a la vida que me ha dado tanto
> Me ha dado la marcha de mis pies cansados
> Con ellos anduve ciudades y charcos
> Playas y desiertos, montañas y llanos
> Y la casa tuya, tu calle y tu patio"
>
> Violeta Parra - Gracias a la Vida
>
> Tiago S. F. Hori. PhD.
> Ocean Science Center-Memorial University of Newfoundland 
>


From sebastien.moretti at unil.ch  Thu Feb 14 03:09:06 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=)
Date: Thu, 14 Feb 2013 09:09:06 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
	<511BBD83.2000708@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>
Message-ID: <511C9BA2.9000508@unil.ch>

>>>>>> # Add annotation
>>>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>>>                                 -xml => '<name>SUMF family</name>',
>>>>>>                                );
>>>>>
>>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>>>
>>>>> 	-hilmar
>>>>
>>>> I replaced $treeio by $tree in the above line but still get an error.
>>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
>>>>
>>>> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string.
>>>>
>>>>
>>>>
>>>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>>>                              -format => 'phyloxml',
>>>>                             );
>>>> my $tree = $treeio->next_tree;
>>>>
>>>> # Add annotation
>>>> $tree->add_phyloXML_annotation(-obj => $tree,
>>>>                                -xml => '<name>SUMF family</name>',
>>>>                               );
>>>>
>>>> Can't locate object method "add_phyloXML_annotation" via package
>>>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>>>     (F) You called a method correctly, and it correctly indicated a package
>>>>     functioning as a class, but that package doesn't define that particular
>>>>     method, nor does any of its base classes.  See perlobj.
>>>>
>>>> Uncaught exception from user code:
>>>> 	
>>>> at ./add_annotation_to_phyloxml.pl line 40
>>>
>>> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>>>
>>> chris
>>
>> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ?
>> The problem I have is "expected" ?
>>
>> --
>> Sébastien Moretti
>
> I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky.  I tried cleaning this up a few years back but didn't make much progress.
>
> The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it):
>
>      $treeio->add_phyloXML_annotation(-obj => $tree,
>                                -xml => '<name>SUMF family</name>',
>                               );
>
> My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back.  Can you file a bug on this?
>
> https://redmine.open-bio.org/
>
> chris

I will file a bug on this.

I'd be happy to try to contribute to the phyloxml code,
but I don't know how to proceed for BioPerl.

-- 
Sébastien Moretti

From hartzell at alerce.com  Thu Feb 14 15:04:44 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 14 Feb 2013 12:04:44 -0800
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
References: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
	<3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
Message-ID: <20765.17244.185833.755900@gargle.gargle.HOWL>


I think that it's important to get feedback on code that one has
written and to try to understand what someone else has done in their
code, and how and why.  To that end....

Since Tiago's using this to learn the language better I can't resist
some comments beyond resetting the file handle.

For grins I rewrote it using Text::CSV_XS and Statistics::Basic and to
take a single pass through the data file using a multilevel data
structure.
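
A rough sketch of the single-pass, multilevel-structure idea, not the gist
itself. To stay dependency-free it uses plain split and List::Util rather than
Text::CSV_XS and Statistics::Basic, and the sample rows and the %weights name
are made up for illustration. Column positions and the 2N/3N labels follow the
original post:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical rows: id, family label, filler columns, weight in column 7.
my @rows = (
    "1\tABF09-AS5\t-\t-\t-\t-\t10",
    "2\tABF09-AS5PS\t-\t-\t-\t-\t20",
    "3\tABF09-AS9\t-\t-\t-\t-\t30",
    "4\tABF09-AS5\t-\t-\t-\t-\t14",
);

my %weights;    # $weights{family}{ploidy} = [ weight, ... ]
for my $row (@rows) {
    my @data = split /\t/, $row;
    # Labels look like ABF09-AS5 (2N) or ABF09-AS5PS (3N).
    next unless $data[1] =~ /^ABF09-(\w+?)(PS)?$/;
    my ($family, $ploidy) = ($1, $2 ? '3N' : '2N');
    push @{ $weights{$family}{$ploidy} }, $data[6];
}

# One report line per family and ploidy, all from a single read of the data.
for my $family (sort keys %weights) {
    for my $ploidy (sort keys %{ $weights{$family} }) {
        my $w = $weights{$family}{$ploidy};
        printf "%s %s: n=%d mean=%.2f\n",
            $family, $ploidy, scalar @$w, sum(@$w) / @$w;
    }
}
```

Because every family is collected in one pass, adding a family needs no extra
read of the file and no rewinding of the filehandle.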

I resisted the urge to rewrite it in Moose.  Didn't even have an urge
to rewrite it in R.  Funny, that....

The script is here

  Tiago.pl
    https://gist.github.com/hartzell/4955401

With something like what I think the data looks like here:

    https://gist.github.com/hartzell/4955570

Even without that big of a rewrite, I had a bunch of local comments
which are inline below.

Daisie Huang writes:
 > [...]
 > On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
 > >
 > > Hey Guys,
 > >
 > > I am still at the same place. I am writing these little pieces of code to 
 > > try to learn the language better, so any advice would be useful.
 > > [...]
 > > here is the code:
 > >
 > > #! /usr/bin/perl
 > > use strict;
 > > use warnings;
 > >
 > > my $file = $ARGV[0];

Slightly better would be $filename, so that when you step up to
Path::Class you can differentiate a file object from a file name
string.

 > > my @family = ('AS5','AS9');

Better would be @families, plural.  See the use of $family below.

 > > my $i;
 > > my $ii;

As far as I can tell, these are just counting the number of things
that you push onto the various arrays.  You don't need them, referring
to the list in scalar context will give you its size.

 > > my $test;

You use this to hold the name of the family, so it's not particularly
evocative.  You should also restrict its scope to within the loop.
See the comment for the foreach loop.

 > > open (my $fh, "<", $file) or die ("Can't open $file: $!");

You made my day, three arg. open *and* you checked for errors.  Nice!

 > > foreach (@family){

Better would be

  for my $family (@families) {

which is evocative and restricts the scope of $family to the for loop
(and for is 4 characters shorter than foreach...).

 > >     $test = $_;

No longer need this, using $family declared in the for loop with the
proper scoping.

 > >     my @data_weight_2N = ();
 > >     my @data_weight_3N = ();
 > >     while (<$fh>){
 > >         chomp;  
 > >         my $line = $_;
 > >         my @data  = split ("\t", $line);

Don't parse CSV (TSV) files yourself.  Get in the habit of using
Text::CSV_XS.

 > >         if ($data[0] !~ /[0-9]*/){
 > >         next;}
 > >         elsif ($data[1] eq "ABF09-$test"){
 > >             $i += 1; 

You don't need the counter.

 > >             push (@data_weight_2N,  $data[6]);
 > >         }elsif ($data[1] eq "ABF09-".$test."PS"){
 > >         $ii += 1;

You don't need the counter.

 > >             push (@data_weight_3N,$data[6]);
 > >     }
 > > }
 > >     my $mean_2N = &average (\@data_weight_2N);
 > >     my $stdev_2N = &stdev (\@data_weight_2N);

You don't need the ampersands on the subroutine calls.  They're old
school <joke> and just encourage people to make fun of our language for its
use of all those funny punctuation marks </joke>.

 > >     my $stderr_2N = ($stdev_2N/sqrt($i));

Unless I'm mistaken, this is equivalent

    my $stderr_2N = ($stdev_2N/sqrt(scalar @data_weight_2N));

and you don't need the counter, the explicit use of scalar there might
even be redundant (I'm a coward).  You use the same trick in your
subroutine defn's below.
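
A small check of that point, with made-up weights: an array used where a
scalar is expected yields its element count, so a separate counter is not
needed, and the explicit scalar is harmless but optional inside sqrt().

```perl
use strict;
use warnings;

my @data_weight_2N = (12, 15, 18, 21);    # hypothetical weights

# Assigning an array to a scalar gives the number of elements.
my $implicit = @data_weight_2N;
my $explicit = scalar @data_weight_2N;

# sqrt() takes one scalar argument, so the array is evaluated in
# scalar context here as well, with or without the scalar keyword.
my $root_implicit = sqrt(@data_weight_2N);
my $root_explicit = sqrt(scalar @data_weight_2N);

print "count: $implicit / $explicit\n";
print "roots agree\n" if $root_implicit == $root_explicit;
```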

 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test 
 > > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
 > >
 > >     my $mean_3N = &average (\@data_weight_3N);
 > >     my $stdev_3N = &stdev (\@data_weight_3N);
 > >     my $stderr_3N = ($stdev_3N/sqrt($i));
 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test 
 > > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
 > > }
 > >
 > > close ($fh);

Ah, rats.  You checked whether open worked, you need to do the same
thing on close too!

  close ($fh) or die $!;

Or you could just

  use autodie qw(open close);

and then they'll die appropriately when they have to and you don't
have to bother with the checking.
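
A minimal illustration of the autodie behaviour, assuming a path that does not
exist (the file name is hypothetical). The failed open throws an exception
object instead of returning false, so there is no "or die" to forget:

```perl
use strict;
use warnings;
use autodie qw(open close);

my $missing = '/nonexistent/dir/hypothetical.tsv';    # assumed not to exist

my $error;
eval {
    open(my $fh, '<', $missing);    # dies here; no explicit check needed
    close($fh);
    1;
} or $error = $@;

# autodie throws an autodie::exception object; matches() reports
# which builtin failed.
if ($error && $error->matches('open')) {
    print "open failed as expected\n";
}
```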

 > > sub average{
 > >         my($data) = @_;
 > >         if (not @$data) {
 > >                 print ("Empty array\n");
 > >                 return 0;
 > >         }
 > >         my $total = 0;
 > >         foreach (@$data) {
 > >                 $total += $_;
 > >         }

  use List::AllUtils qw(sum); # somewhere up at the top of the script...

  my $total = sum(@$data);
  if (not defined $total) {
     print "Empty array\n";
     return;
  }

List::AllUtils is your friend.  Learn to use it.

Returning 0 for an empty list is probably the wrong thing; isn't
it possible for the total to actually be 0?  Just return instead.
Don't return undef, just return (and let perl take context into
account for you).
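
To make the context point concrete, here is a small sketch with two
hypothetical subs. A bare return gives an empty list in list context and undef
in scalar context, while return undef gives a one-element list, which is true
when tested as an array:

```perl
use strict;
use warnings;

sub bare_return  { return }          # context-aware "nothing"
sub undef_return { return undef }    # always a single undef

my @from_bare   = bare_return();     # empty list
my @from_undef  = undef_return();    # list holding one undef
my $scalar_bare = bare_return();     # undef

print "bare return:  ", scalar(@from_bare),  " element(s)\n";
print "return undef: ", scalar(@from_undef), " element(s)\n";
print "scalar call is undefined\n" unless defined $scalar_bare;
```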

You probably don't actually want to spew "Empty array" out into your
output stream, imagine writing a script that postprocesses your output
and having to deal with it.  If you really need to say it, send it to
standard error with

  print STDERR "Empty array\n";

 > >         my $average = $total / @$data;
 > >         return $average;

If you don't really need the error message, then you can get to

  my $total = sum(@$data);
  return unless defined $total;
  return $total / @$data;

And if an empty data array is *truly* unexpected, maybe you should
just die/carp.
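
A runnable sketch of such an average routine, using sum from the core
List::Util module (List::AllUtils re-exports the same function). sum returns
undef for an empty list, so testing with defined keeps a genuine total of 0
distinct from "no data":

```perl
use strict;
use warnings;
use List::Util qw(sum);

sub average {
    my ($data) = @_;
    my $total = sum(@$data);
    return unless defined $total;    # empty list: nothing to average
    return $total / @$data;
}

my $mean  = average([1, 2, 3]);
my $zero  = average([0, 0]);         # a real average that happens to be 0
my $empty = average([]);             # undef in scalar context

print "mean: $mean\n";
print "zero is a defined result\n" if defined $zero;
print "empty input gives no result\n" unless defined $empty;
```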

 > > }
 > >
 > > sub stdev{
 > >         my($data) = @_;
 > >         if(@$data == 1){
 > >                 return 0;
 > >         }
 > >         my $average = &average($data);
 > >         my $sqtotal = 0;
 > >         foreach(@$data) {
 > >                 $sqtotal += ($average-$_) ** 2;
 > >         }
 > >         my $std = ($sqtotal / (@$data-1)) ** 0.5;
 > >         return $std;
 > > }

Ditto on the use of List::AllUtils, etc...

Phew.

The only other thing I'd like to see would be an arrangement that
lets you write simple tests.  A simple sol'n would be to package the
entire main part of the code up into e.g. a subroutine that returns a
hashref keyed by family, containing a hashref keyed by 2N/3N/... and
then you could just:

  use Test::More;
  
  use Tiago qw(summarize);
  
  my $output = summarize("test_data.tsv");
  
  is($output->{AS5}->{'2N'}, "42", "Got the magic number");
  
  # etc...
  
  done_testing;
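
A self-contained variant of that test, for illustration only. The Tiago module
does not exist, so summarize() is defined inline here as a hypothetical
stand-in that takes rows directly instead of a file name and returns mean
weights keyed by family and then by 2N/3N:

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the suggested summarize() routine.
sub summarize {
    my ($rows) = @_;
    my (%sum, %n);
    for my $row (@$rows) {
        my ($family, $ploidy, $weight) = @$row;
        $sum{$family}{$ploidy} += $weight;
        $n{$family}{$ploidy}++;
    }
    my %mean;
    for my $family (keys %sum) {
        for my $ploidy (keys %{ $sum{$family} }) {
            $mean{$family}{$ploidy} =
                $sum{$family}{$ploidy} / $n{$family}{$ploidy};
        }
    }
    return \%mean;
}

# Made-up rows: family, ploidy, weight.
my $output = summarize([
    [ 'AS5', '2N', 40 ],
    [ 'AS5', '2N', 44 ],
    [ 'AS5', '3N', 50 ],
]);

is($output->{AS5}{'2N'}, 42, 'mean weight for AS5 2N');
is($output->{AS5}{'3N'}, 50, 'mean weight for AS5 3N');

done_testing;
```

Returning a plain data structure rather than printing is what makes the
routine testable: the report formatting can live in a thin wrapper.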
  
Thanks for sharing your code.  Keep practicing!

g.

From carandraug+dev at gmail.com  Thu Feb 14 17:13:45 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Thu, 14 Feb 2013 22:13:45 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
Message-ID: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>

Hi

we got word of it on another project I'm involved with and I was
wondering. Is bioperl going to apply for the Google Summer of Code
this year?

http://www.google-melange.com/gsoc/homepage/google/gsoc2013

Carnë


From hlapp at drycafe.net  Fri Feb 15 09:28:30 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 15 Feb 2013 09:28:30 -0500
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
Message-ID: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>

I presume the OBF will apply as an umbrella organization on behalf of all Bio* projects. If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or to look for co-mentors.

-hilmar

Sent with a tap.

On Feb 14, 2013, at 5:13 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> we got word of it on another project I'm involved with and I was
> wondering. Is bioperl going to apply for the Google Summer of Code
> this year?
> 
> http://www.google-melange.com/gsoc/homepage/google/gsoc2013
> 
> Carnë
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From p.j.a.cock at googlemail.com  Fri Feb 15 09:47:39 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 15 Feb 2013 14:47:39 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
Message-ID: <CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>

On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:
> I presume the OBF does as an umbrella organization on behalf of all Bio*
> projects. If you fancy proposing a project idea or mentoring, now is not a
> bad time to think about that or looking for co-mentors.
>
> -hilmar

Yes, the plan is that as in the last few years, the OBF will apply to
GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
this stage the Bio* projects would be wise to start coming up with
some good project ideas and experienced developers thinking about
being a mentor. For potential students, getting involved in the
community early is a good idea (e.g. bug reports, or better fixing
existing bugs)

See also:
http://lists.open-bio.org/mailman/listinfo/gsoc
http://lists.open-bio.org/mailman/listinfo/gsoc-mentors

Peter

From cjfields at illinois.edu  Fri Feb 15 09:59:43 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 15 Feb 2013 14:59:43 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
	<CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>

On Feb 15, 2013, at 8:47 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:
>> I presume the OBF does as an umbrella organization on behalf of all Bio*
>> projects. If you fancy proposing a project idea or mentoring, now is not a
>> bad time to think about that or looking for co-mentors.
>> 
>> -hilmar
> 
> Yes, the plan is that as in the last few years, the OBF will apply to
> GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
> this stage the Bio* projects would be wise to start coming up with
> some good project ideas and experienced developers thinking about
> being a mentor. For potential students, getting involved in the
> community early is a good idea (e.g. bug reports, or better fixing
> existing bugs)
> 
> See also:
> http://lists.open-bio.org/mailman/listinfo/gsoc
> http://lists.open-bio.org/mailman/listinfo/gsoc-mentors
> 
> Peter

At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else.  I can't take charge of writing up a proposal at the moment but I can certainly help edit.

chris


From scott at scottcain.net  Fri Feb 15 14:18:37 2013
From: scott at scottcain.net (Scott Cain)
Date: Fri, 15 Feb 2013 14:18:37 -0500
Subject: [Bioperl-l] sequence-region directives in gff files
In-Reply-To: <CAPOrs_3r_cay3d59uBXCNqKwGHRBOBy+c+XOzvrfMeHdbzNTLg@mail.gmail.com>
References: <CAPOrs_3r_cay3d59uBXCNqKwGHRBOBy+c+XOzvrfMeHdbzNTLg@mail.gmail.com>
Message-ID: <CA+JTaox4SeQueWRpvgmq7GpdJ=EzQe6t3Lim2yn6y=_dBcp95A@mail.gmail.com>

Hi Carnë,

Thanks for pointing this out; I was only sort of paying attention to
the FeatureIO discussion, and it hadn't occurred to me that my commit
was the problem.

I believe I've reproduced the functionality from that commit, and I
even added a test that makes use of the added method (yes, I know, it
surprised me too!).  All of the tests now pass for me in the FeatureIO
master.  I'm putting it on my todo list to check that the Chado loader
that makes use of Bio::FeatureIO still works as expected with the new
incarnation.

Thanks,
Scott


On Wed, Feb 13, 2013 at 5:22 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
> Hi Scott
>
> 3 years ago, the code for the Bio::SeqFeatureIO::* modules was split
> from bioperl-live into a separate repository[1]. Because the code was
> not removed from the bioperl-live repository, people ended up patching
> on both sides, leading to 2 branches of development. Last weekend I
> merged them back together with the exception of one commit that would
> no longer apply[2].
>
> This commit was authored by you with the following commit message:
> "tiny change to Bio::FeatureIO::gff to allow the gmod chado gff3 bulk
> loader to not choke when the gff file has ##sequence-region
> directives.  The loader is documented not to support this, but now it
> will quitely ignore those directives."
>
> Do you think you could take a look at it?
>
> Thank you,
> Carnë
>
> [1] https://github.com/bioperl/Bio-FeatureIO
> [2] https://github.com/bioperl/bioperl-live/commit/7218728b66ad297953676236077fd0ec757378c0


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From carandraug+dev at gmail.com  Tue Feb 19 13:52:57 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Tue, 19 Feb 2013 18:52:57 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAPOrs_0u2Qpft6_pWMaj3Wdf_-ZPOfnoYoOaevdCL443hnUsoA@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
	<CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0u2Qpft6_pWMaj3Wdf_-ZPOfnoYoOaevdCL443hnUsoA@mail.gmail.com>
Message-ID: <CAPOrs_0kiyqSfvS7ZgEkWwbAaiA2L5fV9U2r5U9cROTvyMGLRw@mail.gmail.com>

On 15 February 2013 14:28, Hilmar Lapp <hlapp at drycafe.net> wrote:
> [...]
> If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors.

On 15 February 2013 14:59, Fields, Christopher J <cjfields at illinois.edu> wrote:
> At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else.  I can't take charge of writing up a proposal at the moment but I can certainly help edit.

I would like to participate this year as a student.

I do not, however, have any bioperl itch that would take a summer
to fix. The largest of them is to implement BLAST using NCBI's server.
They have made a SOAP-based BLAST available, and doing this has been on
my to-do list for ages. Would you suggest any other project for bioperl?

Carnë


From peymanalavi at yahoo.com  Tue Feb 19 16:16:49 2013
From: peymanalavi at yahoo.com (peyman alavi)
Date: Tue, 19 Feb 2013 13:16:49 -0800 (PST)
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
Message-ID: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>

Hello,

I have been having problems for a while trying to install the Bio::SCF
module on my 32-bit Vista machine. I know that Bio::SCF isn't really a
BioPerl module, but I need it for Bio::Graphics, and I thought other
people might have run into the same problem before. I have installed
zlib and io_lib (the latest available version of each), but it looks
like something (presumably to do with io_lib) is missing. I would be
very grateful if someone could tell me what still needs to be done!

Here are the paths where the io_lib "library" and "include" directories
are installed; I set them in cpan before trying to install Bio::SCF:

o conf makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"

And the following is what I get on STDOUT:

Set up gcc environment - 4.7.2

cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.

    makepl_arg         [LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include]
Please use 'o conf commit' to make the config permanent!

Reading 'D:\Perl\cpan\Metadata'
  Database was generated on Sun, 17 Feb 2013 12:17:02 GMT
Running install for module 'Bio::SCF'
Running make for L/LD/LDS/Bio-SCF-1.03.tar.gz
Checksum for D:\Perl\cpan\sources\authors\id\L\LD\LDS\Bio-SCF-1.03.tar.gz ok
Scanning cache D:\Perl/cpan/build for sizes
...........................................................................DONE
Bio-SCF-1.03/
Bio-SCF-1.03/t/
Bio-SCF-1.03/t/scf.t
Bio-SCF-1.03/eg/
Bio-SCF-1.03/eg/write_test_obj.pl
Bio-SCF-1.03/eg/write_test_tied.pl
Bio-SCF-1.03/eg/read_test_obj.pl
Bio-SCF-1.03/eg/read_test_tied.pl
Bio-SCF-1.03/SCF/
Bio-SCF-1.03/SCF/Arrays.pm
Bio-SCF-1.03/DISCLAIMER
Bio-SCF-1.03/README
Bio-SCF-1.03/SCF.pm
Bio-SCF-1.03/SCF.xs
Bio-SCF-1.03/Changes
Bio-SCF-1.03/test.scf
Bio-SCF-1.03/Makefile.PL
Bio-SCF-1.03/META.yml
Bio-SCF-1.03/INSTALL
Bio-SCF-1.03/MANIFEST

  CPAN.pm: Building L/LD/LDS/Bio-SCF-1.03.tar.gz

Set up gcc environment - 4.7.2
Checking if your kit is complete...
Looks good
Writing Makefile for Bio::SCF
Writing MYMETA.yml and MYMETA.json
cp SCF.pm blib\lib\Bio\SCF.pm
cp SCF/Arrays.pm blib\lib\Bio\SCF\Arrays.pm
D:\Perl\bin\perl.exe D:\Perl\site\lib\ExtUtils\xsubpp  -typemap D:\Perl\lib\ExtUtils\typemap  SCF.xs > SCF.xsc &&
D:\Perl\bin\perl.exe -MExtUtils::Command -e mv -- SCF.xsc SCF.c
Please specify prototyping behavior for SCF.xs (see perlxs manual)
c:/MinGW/bin/gcc.exe -c  -Ic:/MinGW/msys/1.0/local/include  -DNDEBUG
-DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE
-DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T
-DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields -O2  -DVERSION=\"1.03\"  -DXS_VERSION=\"1.03\"  "-ID:\Perl\lib\CORE"  -DLITTLE_ENDIAN SCF.c
In file included from c:/MinGW/msys/1.0/local/include/io_lib/scf.h:31:0,
                 from SCF.xs:12:
c:/MinGW/msys/1.0/local/include/io_lib/mFILE.h:23:0: warning: "MF_APPEND" redefined [enabled by default]
In file included from c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/windows.h:55:0,
                 from D:\Perl\lib\CORE/win32.h:61,
                 from D:\Perl\lib\CORE/win32thread.h:4,
                 from D:\Perl\lib\CORE/perl.h:2825,
                 from SCF.xs:5:
c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/winuser.h:131:0:
note: this is the location of the previous definition
SCF.xs: In function 'XS_Bio__SCF_get_scf_pointer':
SCF.xs:35:2: warning: passing argument 3 of '(*Perl_ILIO_ptr((struct
PerlInterpreter *)Perl_get_context()))->pNameStat' from incompatible pointer
type [enabled by default]
SCF.xs:35:2: note: expected 'struct _stati64 *' but argument is of type
'struct stat *'
Running Mkbootstrap for Bio::SCF ()
D:\Perl\bin\perl.exe -MExtUtils::Command -e chmod -- 644 SCF.bs
D:\Perl\bin\perl.exe -MExtUtils::Mksymlists \
     -e "Mksymlists('NAME'=>\"Bio::SCF\", 'DLBASE' => 'SCF', 'DL_FUNCS' => {  }, 'FUNCLIST' => [], 'IMPORTS' => {  }, 'DL_VARS' => []);"
Set up gcc environment - 4.7.2
dlltool --def SCF.def --output-exp dll.exp
c:\MinGW\bin\g++.exe -o blib\arch\auto\Bio\SCF\SCF.dll -Wl,--base-file
-Wl,dll.base -mdll -L"D:\Perl\lib\CORE" SCF.o   D:\Perl\lib\CORE\perl512.lib
c:\MinGW\lib\libkernel32.a c:\MinGW\lib\libuser32.a c:\MinGW\lib\libgdi32.a
c:\MinGW\lib\libwinspool.a c:\MinGW\lib\libcomdlg32.a c:\MinGW\lib\libadvapi32.a
c:\MinGW\lib\libshell32.a c:\MinGW\lib\libole32.a c:\MinGW\lib\liboleaut32.a
c:\MinGW\lib\libnetapi32.a c:\MinGW\lib\libuuid.a c:\MinGW\lib\libws2_32.a
c:\MinGW\lib\libmpr.a c:\MinGW\lib\libwinmm.a c:\MinGW\lib\libversion.a
c:\MinGW\lib\libodbc32.a c:\MinGW\lib\libodbccp32.a c:\MinGW\lib\libcomctl32.a
c:\MinGW\lib\libmsvcrt.a dll.exp
Warning: resolving _VirtualQuery at 12 by linking to _VirtualQuery
Use --enable-stdcall-fixup to disable these warnings
Use --disable-stdcall-fixup to disable these fixups
Warning: resolving _VirtualProtect at 16 by linking to _VirtualProtect
Warning: resolving _EnterCriticalSection at 4 by linking to
_EnterCriticalSection
Warning: resolving _TlsGetValue at 4 by linking to _TlsGetValue
Warning: resolving _GetLastError at 0 by linking to _GetLastError
Warning: resolving _LeaveCriticalSection at 4 by linking to
_LeaveCriticalSection
Warning: resolving _DeleteCriticalSection at 4 by linking to
_DeleteCriticalSection
Warning: resolving _InitializeCriticalSection at 4 by linking to
_InitializeCriticalSection
SCF.o:SCF.c:(.text+0xf35): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0xf4b): undefined reference to `mfwrite_scf'
SCF.o:SCF.c:(.text+0xf6a): undefined reference to `mfflush'
SCF.o:SCF.c:(.text+0xf72): undefined reference to `mfdestroy'
SCF.o:SCF.c:(.text+0x1138): undefined reference to `write_scf'
SCF.o:SCF.c:(.text+0x16ac): undefined reference to `scf_deallocate'
SCF.o:SCF.c:(.text+0x17b1): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0x17c1): undefined reference to `mfread_scf'
SCF.o:SCF.c:(.text+0x19bd): undefined reference to `read_scf'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe:
SCF.o: bad reloc address 0xa4 in section `.rdata'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe:
final link failed: Invalid operation
collect2.exe: error: ld returned 1 exit status
dmake.exe:  Error code 129, while making 'blib\arch\auto\Bio\SCF\SCF.dll'
  LDS/Bio-SCF-1.03.tar.gz
  D:\Perl\site\bin\dmake.exe -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 LDS/Bio-SCF-1.03.tar.gz                      : make NO

Warning: Configuration not saved.
Lockfile removed.

Thanks in advance for any useful suggestions/help!!
Peyman


From scott at scottcain.net  Tue Feb 19 18:39:44 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 19 Feb 2013 18:39:44 -0500
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan
	fails
In-Reply-To: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
References: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
Message-ID: <777246AB-2EF0-403D-9652-8EA8390D5C53@scottcain.net>

Hi Peyman,

I have no idea what might be required to get staden and Bio::SCF installed on a windows machine; you have my sympathies for having to go through it. 

But what I wanted to touch on was what you wrote, that is, that you "need" it for Bio::Graphics. I just wanted to point out that you don't need it unless you want to be able to display traces from ABI sequencers (which most people don't really care to do these days). Bio::SCF is listed as a recommended module, not a required one.
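
If it helps to see which of the two actually loads on a given machine, a check along these lines should do. This is a generic Perl sketch, module_available is a name made up for it, and it only reports whether a module can be loaded, nothing more.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
sub module_available {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Bio::SCF -> Bio/SCF.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(Bio::Graphics Bio::SCF)) {
    printf "%-14s %s\n", $module,
        module_available($module) ? 'installed' : 'not installed';
}
```

If Bio::Graphics loads and Bio::SCF does not, only the trace display is unavailable.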

Scott


Sent from my iPad

On Feb 19, 2013, at 4:16 PM, peyman alavi <peymanalavi at yahoo.com> wrote:

> Hello,
> I am having
> problems for a while trying to install the Bio::SCF module on my Vista32. Now, I know that Bio::SCF isn't really a Bioperl module, but I need it for Bio::Graphics, and I thought perhaps other people had experienced the same problem before.  I
> have installed zlib and io_lib (both their last available versions), but it
> looks like sth. (presumably with io_lib) is missing. I should be very grateful
> if someone could tell me what still needs to be done!
> Here are
> the paths where the io_lib "library" and "include" directories are installed, and I
> set them to cpan before trying to install Bio::SCF:
> o conf
> makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"
> And the
> following is what I get on the STDOUT:
> [...]
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From anngregory at email.arizona.edu  Wed Feb 20 00:20:41 2013
From: anngregory at email.arizona.edu (Ann Gregory)
Date: Tue, 19 Feb 2013 22:20:41 -0700
Subject: [Bioperl-l]  Problem Parsing BLAST output to annotate FASTA file
Message-ID: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>

Hi BioPerl,

I am having issues with a BioPerl script. I have a blastxml file from a
blastx search and the original multifasta file containing the original
nucleotide sequences.

I want to take the blast result (i.e. the blast description) and annotate my
multifasta file.

I have written 2 while loops that extract the blast descriptions as well as
the nucleotide sequence from the multifasta file.

My problem is that I cannot incorporate one of the while loops into the
other without losing the loop property of one of the loops. I would like
to take the 1st blast description, then the 1st nucleotide sequence, then
the 2nd blast description, then the 2nd nucleotide sequence and so
on... I just can't figure out how to alternate the results.

See script below:


use warnings;
use strict;
use Bio::SearchIO;
use Bio::SeqIO;


my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            my $qd = $hit->description;
            print $qd, "\n";
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

--
Ann (Nina) Gregory
Graduate Student
Rich Lab / Sullivan Lab
Soil, Water, Environmental Science Department
University of Arizona

From yonexhalaolv at gmail.com  Wed Feb 20 04:17:12 2013
From: yonexhalaolv at gmail.com (Sebastian Lau)
Date: Wed, 20 Feb 2013 01:17:12 -0800 (PST)
Subject: [Bioperl-l] =?utf-8?q?failed_to_install_via_fink=EF=BC=9Ano_packa?=
 =?utf-8?q?ge_found_for_specification_=27bioperl-pm5100=27!?=
Message-ID: <84fa1bcb-a39f-4847-bff2-e3a9c2b909ea@googlegroups.com>

Hi guys,

I was just about to install bioperl on my MacOS 10.7.5 via fink, but after
typing the command, fink said it couldn't find any package:

fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm5100
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm5100'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm588
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm588'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm586
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm586'!

I followed the instructions on the wiki. I don't know what's wrong with it.
Thanks for your help.

From awitney at sgul.ac.uk  Wed Feb 20 10:22:51 2013
From: awitney at sgul.ac.uk (Adam Witney)
Date: Wed, 20 Feb 2013 15:22:51 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124EA4B.5020409@sgul.ac.uk>


Hi Ann,

On 20/02/2013 05:20, Ann Gregory wrote:
> Hi BioPerl,
> 
> I am having issues with a BioPerl script. I have a blastxml file from a
> blastx blast and the original multifasta file containing the original
> nucleotides sequences.
> 
> I want to take the blast result (ie. the blast description) and annotate my
> multifasta file.
> 
> I have written 2 while loops that extract the blast descriptions as well as
> the nucleotide sequence from the multifasta file.
> 
> My problem is that I cannot incorporate one of the while loops into the
> other without loosing the loop property of one of the loops. I would like
> to take the 1st blast description, then the 1st nucleotide sequence, then
> the 2nd blast description, then the 2nd nucleotide sequence and so
> on...just can figure out how to alternate the results.
> 
> See script below:
> 
> 
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> 
> 
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $qd = $hit->description;
> print $qd, "\n";
> }
> }
> }
> 
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> print $nuc, "\n";
> }--

I think what you are proposing assumes that the loop over the BLAST
results will come back in the same order as the loop over the Fasta
file. This may be the case, but I'm not sure it's something I would rely on.

Anyway, I would loop over the BLAST results, storing the relevant data
in an array or hash, and then loop over the fasta file to put the two
together, e.g.:

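my $blast_data = {};    # query name => description of its first hit

while ( my $result = $search_in->next_result ) {
	my $hit = $result->next_hit or next;    # no hits for this query
	# store whatever you want here; assumes query_name matches the fasta id
	$blast_data->{ $result->query_name } = $hit->description;
}

while ( my $seqobj = $seqio->next_seq ) {
	my $id = $seqobj->id;
	next unless exists $blast_data->{$id};
	print $blast_data->{$id}, "\n";
}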

Something along those lines... or have I misunderstood you? If so, can
you provide some more details, such as what you want your output to look
like?

HTH

Adam

From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 11:24:50 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:24:50 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124F8D2.4020904@uni-wuerzburg.de>

Oops, I just realized I had one loop too many in there. Adam is correct.
Sorry.

The last part of the code I sent you should look like this:

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    print ">$hits{$seqobj->display_id}\n";
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}


Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de


From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 11:14:29 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:14:29 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124F665.5050602@uni-wuerzburg.de>

Hi Ann,

I agree with Adam, but I was already writing my email, while his came 
in. Hope it helps:

I hope I understand correctly what you want to do.
Just to clarify, you queried a protein blast database with blastx and 
nucleotide queries. Now you want to associate the protein description 
for the FIRST blast hit with the corresponding nucleotide fasta file. Is 
that correct?
You have to put the two while loops into one another. Or associate the 
blast hits with the query descriptions. But it's not feasible to take 
the first blast hit and the first nucleotide fasta seq, then the 2nd of 
both etc, as Adam already pointed out.
You would have to iterate through both at the same time. I.e. take the 
first blast hit, then iterate through the nucleotide fasta until you 
find the hit. Then take the 2nd blast hit and iterate through the 
nucleotide fasta etc. It's probably easiest to do this in a hash.

Something along the lines of (not tested I just punched that in the E-Mail):

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
"$ARGV[0]");
while (my $result = $search_in->next_result) {
while (my $hit = $result->next_hit) {
while (my $hsp = $hit->next_hsp) {
if ($hit->description eq $hit_desc) { # Only want the first blast hit
next;
}
my $hit_desc = $hit->description;
$hits{$result->query_description} = $hit_desc;
}
}
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
foreach my $query (keys %hits) {
while (my $seqobj = $seqio->next_seq) {
if ($seqobj->display_id eq $query) {
print ">$hits{$query}\n";
my $nuc = $seqobj->seq();
print $nuc, "\n";
}

You might want to put some evalue cutoff in there to only score 
significant hits. Also if your nucleotide query multi-fasta file is very 
large, you might consider creating an index first:
http://www.bioperl.org/wiki/HOWTO:Local_Databases#Bio::Index
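
For the evalue cutoff, something like the following might work. It is an
untested sketch: best_hit_descriptions is a made-up name, the cutoff value
is up to you, and it keys the hash on query_name on the assumption that
this is what matches the display_id of your fasta entries, so check that
against your own report.

```perl
use strict;
use warnings;

# Map each query name to the description of its first hit at or below
# $max_evalue. $search_in is expected to behave like a Bio::SearchIO
# object (next_result, next_hit, query_name, significance, description).
sub best_hit_descriptions {
    my ( $search_in, $max_evalue ) = @_;
    my %hits;
    while ( my $result = $search_in->next_result ) {
        while ( my $hit = $result->next_hit ) {
            next if $hit->significance > $max_evalue;    # not significant
            $hits{ $result->query_name } = $hit->description;
            last;    # keep only the first hit that passes
        }
    }
    return %hits;
}

# Usage (assumes BioPerl is installed and $ARGV[0] is a blastxml report):
#   use Bio::SearchIO;
#   my $in   = Bio::SearchIO->new(-format => 'blastxml', -file => $ARGV[0]);
#   my %hits = best_hit_descriptions($in, 1e-5);
```

Queries without any hit below the cutoff simply get no entry in the hash.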

Hope that helps!

Cheers,
Andreas

P.S.: Please next time include version numbers for BioPerl and Perl and 
a little more detail what you want to do. ;-)


--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de


From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 12:00:51 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 18:00:51 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
	<5124F8D2.4020904@uni-wuerzburg.de>
	<CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
Message-ID: <51250143.9050503@uni-wuerzburg.de>

Hey Ann,

damn, it's not my best day ... Anyway, I wouldn't work with
List::MoreUtils's each_array function, as this assumes that the blast
hits and the nucleotide queries are in the same order (as Adam pointed
out). Rather, use a hash, which associates a key with a certain value. Also,
the hash can be used to skip sequences that have no hits.
Here's my new version:

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            # hash: associate query_desc (key) with hit_desc (value)
            # NOTE (assumption): the FASTA loop looks keys up by display_id, so
            # if nothing matches, $result->query_name may be the key to use here
            $hits{$result->query_description} = $hit->description;
            last; # jump out of the while loop; this should resolve getting only the first hit
        }
        last; # see above
    }
}


my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    # only true if display_id is associated with a hit_desc; skips seqs without hits
    if ($hits{$seqobj->display_id}) {
        print ">$hits{$seqobj->display_id}\n";
        my $nuc = $seqobj->seq();
        print $nuc, "\n";
    }
}
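
A minimal sketch of why the ID-keyed hash is more robust than pairing the two lists by position. It uses plain Perl with made-up in-memory data instead of the real files: the IDs, descriptions, sequences and the helper name annotate are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the parsed files: query ID => first hit
# description, and the FASTA records in file order.
my %hits = (
    contig1 => 'DNA polymerase I',
    contig3 => 'hypothetical protein',
);
my @fasta = (
    [ contig1 => 'ATGGCC' ],
    [ contig2 => 'ATGTTT' ],    # no BLAST hit for this one
    [ contig3 => 'ATGAAA' ],
);

# Look each sequence up by ID; records without a hit are skipped.
sub annotate {
    my ($hits, $records) = @_;
    my @out;
    for my $rec (@$records) {
        my ($id, $seq) = @$rec;
        next unless exists $hits->{$id};
        push @out, ">$hits->{$id}\n$seq\n";
    }
    return @out;
}

print annotate(\%hits, \@fasta);
```

Pairing by position would attach the description for contig3 to the sequence of contig2 here, because contig2 contributes nothing to the list of hits.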

Cheers,
Andreas

P.S.: I redirected your mail to the BioPerl mailing list, others might 
profit from my mistakes ;-) ...

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 17:35, Ann Gregory wrote:
> Hi Andreas,
>
> Thanks for your help! I don't understand how this gets the first blast hit:
>
> if ($hit->description eq $hit_desc) { # Only want the first blast hit
> next;
> }
>
> I tried this and seems to be working...but I can't get the 1st blast hit
> or skip the sequences that had no hits. Do you know any quick fixes?
>
> *
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> use List::MoreUtils qw(each_array);
>
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> my @ids;
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $match = $result->num_hits;
> push(@ids, $qd);
> }
> }
> }
> }
>
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> my @seqs;
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> push(@seqs, $nuc);
> }
>
> my $it = each_array(@ids, @seqs);
> while(my($ids,$seqs)=$it->()){
> print $ids, "\n", $seqs, "\n";
> }
> *
>
> Thanks again!
> ~Ann
>
> On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach
> <andreas.leimbach at uni-wuerzburg.de
> <mailto:andreas.leimbach at uni-wuerzburg.de>> wrote:
>
>     oops, I just realized I had one loop too many in there. Adam is
>     correct. Sorry.
>
>     The last part of the code I sent you should look like this:
>
>
>     my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>     while (my $seqobj = $seqio->next_seq) {
>     print ">$hits{$seqobj->display_id}\n";
>
>     my $nuc = $seqobj->seq();
>     print $nuc, "\n";
>     }
>
>
>     Cheers,
>     Andreas
>
>
>     --
>     Andreas Leimbach
>     Universität Münster
>     Institut für Hygiene
>     Mendelstr. 7
>     D-48149 Münster
>     Germany
>
>     Tel.: +49 (0)551 39 3843
>     E-Mail: andreas.leimbach at uni-wuerzburg.de
>
>     On 20.2.13 06:20, Ann Gregory wrote:
>
>         Hi BioPerl,
>
>         I am having issues with a BioPerl script. I have a blastxml file
>         from a blastx search and the original multifasta file containing
>         the original nucleotide sequences.
>
>         I want to take the blast result (i.e. the blast description) and
>         annotate my multifasta file.
>
>         I have written 2 while loops that extract the blast descriptions
>         as well as the nucleotide sequence from the multifasta file.
>
>         My problem is that I cannot incorporate one of the while loops
>         into the other without losing the loop property of one of the
>         loops. I would like to take the 1st blast description, then the
>         1st nucleotide sequence, then the 2nd blast description, then the
>         2nd nucleotide sequence and so on... I just can't figure out how
>         to alternate the results.
>
>         See script below:
>
>
>         use warnings;
>         use strict;
>         use Bio::SearchIO;
>         use Bio::SeqIO;
>
>
>         my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
>         "$ARGV[0]");
>         while (my $result = $search_in->next_result) {
>         while (my $hit = $result->next_hit) {
>         while (my $hsp = $hit->next_hsp) {
>         my $qd = $hit->description;
>         print $qd, "\n";
>         }
>         }
>         }
>
>         my $seqio = Bio::SeqIO->new(-format => 'fasta', -file =>
>         "$ARGV[1]");
>         while (my $seqobj = $seqio->next_seq) {
>         my $nuc = $seqobj->seq();
>         print $nuc, "\n";
>         }
>         --
>         Ann (Nina) Gregory
>         Graduate Student
>         Rich Lab / Sullivan Lab
>         Soil, Water, Environmental Science Department
>         University of Arizona
>         _________________________________________________
>         Bioperl-l mailing list
>         Bioperl-l at lists.open-bio.org
>         http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
>
> --
> Ann (Nina) Gregory
> Graduate Student
> Rich Lab / Sullivan Lab
> Soil, Water, Environmental Science Department
> University of Arizona
>
>
>

From cjfields at illinois.edu  Wed Feb 20 13:24:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 20 Feb 2013 18:24:58 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <51250143.9050503@uni-wuerzburg.de>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
	<5124F8D2.4020904@uni-wuerzburg.de>
	<CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
	<51250143.9050503@uni-wuerzburg.de>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2EB4A@CHIMBX5.ad.uillinois.edu>

If this is meant to be run against the same FASTA files for a bunch of BLAST reports, it might be worth setting up a flat-file index and using that to look up and grab the sequences; it should be a LOT faster. Only the first pass (generation of the initial index) would take a little time.  Look at Bio::DB::Fasta for an example.
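
The idea behind a flat-file index can be sketched in plain Perl. This is only an illustration under simple assumptions (the first word of each header is a unique ID, the file does not change after indexing); it is not how Bio::DB::Fasta is implemented, and the helper names build_index and fetch_seq are made up for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# One pass over the file: remember the byte offset of every '>' header.
sub build_index {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my %offset;
    while (1) {
        my $pos  = tell $fh;
        my $line = <$fh>;
        last unless defined $line;
        $offset{$1} = $pos if $line =~ /^>(\S+)/;
    }
    close $fh;
    return \%offset;
}

# Jump straight to one record instead of re-reading the whole file.
sub fetch_seq {
    my ($path, $index, $id) = @_;
    return unless exists $index->{$id};
    open my $fh, '<', $path or die "Cannot open $path: $!";
    seek $fh, $index->{$id}, 0;
    my $header = <$fh>;    # skip the header line
    my $seq = '';
    while (my $line = <$fh>) {
        last if $line =~ /^>/;
        chomp $line;
        $seq .= $line;
    }
    close $fh;
    return $seq;
}

# Tiny demonstration on a temporary two-record FASTA file.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} ">seq1 first\nACGT\nAC\n>seq2\nGGG\n";
close $fh;

my $index = build_index($path);
print fetch_seq($path, $index, 'seq2'), "\n";
```

The index is built once; every later lookup costs one seek plus the lines of that record, which is where the speed-up over re-parsing the FASTA file for each BLAST report comes from.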

chris

On Feb 20, 2013, at 11:00 AM, Andreas Leimbach <andreas.leimbach at uni-wuerzburg.de>
 wrote:

> Hey Ann,
> 
> damn, it 's not my best day ... Anyways, I wouldn't work with List::MoreUtils's each_array function, as this assumes that the blast hits and the nucleotide queries are in the same order (as Adam pointed out). Rather use a hash which associates a key to a certain value. Also, the hash can be used to skip sequences that have no hits.
> Here's my new version:
> 
> my %hits;
> my $hit_desc;
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> $hits{$result->query_description} = $hit->description; # hash: associate query_desc (key) with hit_desc (value)
> last; # jump out of the while loop; this should resolve getting only the first hit
> }
> last; # see above
> }
> }
> 
> 
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> while (my $seqobj = $seqio->next_seq) {
> if ($hits{$seqobj->display_id}) { # only true if display_id associated with hit_desc and should skip seqs without hits
> print ">$hits{$seqobj->display_id}\n";
> my $nuc = $seqobj->seq();
> print $nuc, "\n";
> }
> }
> 
> Cheers,
> Andreas
> 
> P.S.: I redirected your mail to the BioPerl mailing list, others might profit from my mistakes ;-) ...
> 
> --
> Andreas Leimbach
> Universität Münster
> Institut für Hygiene
> Mendelstr. 7
> D-48149 Münster
> Germany
> 
> Tel.: +49 (0)551 39 3843
> E-Mail: andreas.leimbach at uni-wuerzburg.de
> 
> On 20.2.13 17:35, Ann Gregory wrote:
>> Hi Andreas,
>> 
>> Thanks for your help! I don't understand how this gets the first blast hit:
>> 
>> if ($hit->description eq $hit_desc) { # Only want the first blast hit
>> next;
>> }
>> 
>> I tried this and seems to be working...but I can't get the 1st blast hit
>> or skip the sequences that had no hits. Do you know any quick fixes?
>> 
>> *
>> use warnings;
>> use strict;
>> use Bio::SearchIO;
>> use Bio::SeqIO;
>> use List::MoreUtils qw(each_array);
>> 
>> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
>> "$ARGV[0]");
>> my @ids;
>> while (my $result = $search_in->next_result) {
>> while (my $hit = $result->next_hit) {
>> while (my $hsp = $hit->next_hsp) {
>> my $match = $result->num_hits;
>> push(@ids, $qd);
>> }
>> }
>> }
>> }
>> 
>> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>> my @seqs;
>> while (my $seqobj = $seqio->next_seq) {
>> my $nuc = $seqobj->seq();
>> push(@seqs, $nuc);
>> }
>> 
>> my $it = each_array(@ids, @seqs);
>> while(my($ids,$seqs)=$it->()){
>> print $ids, "\n", $seqs, "\n";
>> }
>> *
>> 
>> Thanks again!
>> ~Ann
>> 
>> On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach
>> <andreas.leimbach at uni-wuerzburg.de
>> <mailto:andreas.leimbach at uni-wuerzburg.de>> wrote:
>> 
>>    oops, I just realized I had one loop too many in there. Adam is
>>    correct. Sorry.
>> 
>>    The last part of the code I sent you should look like this:
>> 
>> 
>>    my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>>    while (my $seqobj = $seqio->next_seq) {
>>    print ">$hits{$seqobj->display_id}\n";
>> 
>>    my $nuc = $seqobj->seq();
>>    print $nuc, "\n";
>>    }
>> 
>> 
>>    Cheers,
>>    Andreas
>> 
>> 
>>    --
>>    Andreas Leimbach
>>    Universität Münster
>>    Institut für Hygiene
>>    Mendelstr. 7
>>    D-48149 Münster
>>    Germany
>> 
>>    Tel.: +49 (0)551 39 3843
>>    E-Mail: andreas.leimbach at uni-wuerzburg.de
>> 
>>    On 20.2.13 06:20, Ann Gregory wrote:
>> 
>>        Hi BioPerl,
>> 
>>        I am having issues with a BioPerl script. I have a blastxml file
>>        from a blastx search and the original multifasta file containing
>>        the original nucleotide sequences.
>> 
>>        I want to take the blast result (i.e. the blast description) and
>>        annotate my multifasta file.
>> 
>>        I have written 2 while loops that extract the blast descriptions
>>        as well as the nucleotide sequence from the multifasta file.
>> 
>>        My problem is that I cannot incorporate one of the while loops
>>        into the other without losing the loop property of one of the
>>        loops. I would like to take the 1st blast description, then the
>>        1st nucleotide sequence, then the 2nd blast description, then the
>>        2nd nucleotide sequence and so on... I just can't figure out how
>>        to alternate the results.
>> 
>>        See script below:
>> 
>> 
>>        use warnings;
>>        use strict;
>>        use Bio::SearchIO;
>>        use Bio::SeqIO;
>> 
>> 
>>        my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
>>        "$ARGV[0]");
>>        while (my $result = $search_in->next_result) {
>>        while (my $hit = $result->next_hit) {
>>        while (my $hsp = $hit->next_hsp) {
>>        my $qd = $hit->description;
>>        print $qd, "\n";
>>        }
>>        }
>>        }
>> 
>>        my $seqio = Bio::SeqIO->new(-format => 'fasta', -file =>
>>        "$ARGV[1]");
>>        while (my $seqobj = $seqio->next_seq) {
>>        my $nuc = $seqobj->seq();
>>        print $nuc, "\n";
>>        }
>>        --
>>        Ann (Nina) Gregory
>>        Graduate Student
>>        Rich Lab / Sullivan Lab
>>        Soil, Water, Environmental Science Department
>>        University of Arizona
>>        _________________________________________________
>>        Bioperl-l mailing list
>>        Bioperl-l at lists.open-bio.org
>>        http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>> 
>> 
>> 
>> --
>> Ann (Nina) Gregory
>> Graduate Student
>> Rich Lab / Sullivan Lab
>> Soil, Water, Environmental Science Department
>> University of Arizona
>> 
>> 
>> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From carandraug+dev at gmail.com  Mon Feb 25 05:08:23 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Mon, 25 Feb 2013 10:08:23 +0000
Subject: [Bioperl-l] module for description of sequence variants (where to
	place code)
Message-ID: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>

Hi

I'm writing a perl module to write a description of the variance
between 2 sequences as described on
http://www.hgvs.org/mutnomen/recs-prot.html

Basically, given 2 sequences, it would return something like "p.Lys2del
p.His25_Met26insGln" if those are the differences. It also accounts
for the existence of - characters on the sequences that may come from
their alignment.

My question is, where on the project tree should I place the module?

Also, is there something already written that would convert from 1 to
3 letter code?

Carnë


From andreas.leimbach at uni-wuerzburg.de  Mon Feb 25 05:32:43 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Mon, 25 Feb 2013 11:32:43 +0100
Subject: [Bioperl-l] module for description of sequence variants (where
 to place code)
In-Reply-To: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>
References: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>
Message-ID: <512B3DCB.7050008@uni-wuerzburg.de>

Hi Carnë,

for your last question:
You can convert aa strings from one to three letter code with 
'Bio::SeqUtils'.
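
As far as I know the relevant call is Bio::SeqUtils->seq3($seqobj), but check the module documentation. To show what the conversion amounts to without needing BioPerl installed, here is a stand-in sketch with a hand-written lookup table; the helper names to_three and hgvs_deletion are hypothetical and are not part of Bio::SeqUtils.

```perl
use strict;
use warnings;

# Standard one-letter => three-letter amino acid codes (20 common residues).
my %three = (
    A => 'Ala', R => 'Arg', N => 'Asn', D => 'Asp', C => 'Cys',
    Q => 'Gln', E => 'Glu', G => 'Gly', H => 'His', I => 'Ile',
    L => 'Leu', K => 'Lys', M => 'Met', F => 'Phe', P => 'Pro',
    S => 'Ser', T => 'Thr', W => 'Trp', Y => 'Tyr', V => 'Val',
);

# Convert a one-letter string; anything outside the table becomes 'Xaa'.
sub to_three {
    my ($aa) = @_;
    return join '', map { exists $three{$_} ? $three{$_} : 'Xaa' }
                    split //, uc $aa;
}

# Format a single-residue deletion in the HGVS protein style quoted below.
sub hgvs_deletion {
    my ($residue, $pos) = @_;
    return 'p.' . to_three($residue) . $pos . 'del';
}

print to_three('MKH'), "\n";
print hgvs_deletion('K', 2), "\n";
```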

Cheers,
Andreas

--
Andreas Leimbach
Universit?t M?nster
Institut f?r Hygiene
Mendelstr. 7
D-48149 M?nster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 25.2.13 11:08, Carnë Draug wrote:
> Hi
>
> I'm writing a perl module to write a description of the variance
> between 2 sequences as described on
> http://www.hgvs.org/mutnomen/recs-prot.html
>
> Basically, given 2 sequences, it would return something like "p.Lys2del
> p.His25_Met26insGln" if those are the differences. It also accounts
> for the existence of - characters on the sequences that may come from
> their alignment.
>
> My question is, where on the project tree should I place the module?
>
> Also, is there something already written that would convert from 1 to
> 3 letter code?
>
> Carnë
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From genehack at genehack.org  Wed Feb 27 19:57:48 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Wed, 27 Feb 2013 16:57:48 -0800
Subject: [Bioperl-l] YAPC talks?
Message-ID: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>

Hi -

Is there anyone that was planning on submitting a Bioperl talk to
YAPC::NA? In an unrelated conversation, one of the organizers
expressed an interest in getting a Bioperl talk this year.

If no one else is planning on a talk submission, Jay Hannah (aka
deafferret) and I are promising/threatening a tag-team style "Bioperl
rules / Bioperl sucks" overview/state of the dist style talk...

thanks,
john.

From cjfields at illinois.edu  Wed Feb 27 21:48:55 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 28 Feb 2013 02:48:55 +0000
Subject: [Bioperl-l] YAPC talks?
In-Reply-To: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
References: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6E705CD3@CHIMBX5.ad.uillinois.edu>

At the moment I personally have no plans on going, but I think a no-holds-barred bioperl talk is a good idea.  

chris

On Feb 27, 2013, at 6:57 PM, John SJ Anderson <genehack at genehack.org> wrote:

> Hi -
> 
> Is there anyone that was planning on submitting a Bioperl talk to
> YAPC::NA? In an unrelated conversation, one of the organizers
> expressed an interest in getting a Bioperl talk this year.
> 
> If no one else is planning on a talk submission, Jay Hannah (aka
> deafferret) and I are promising/threatening a tag-team style "Bioperl
> rules / Bioperl sucks" overview/state of the dist style talk...
> 
> thanks,
> john.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From hlapp at drycafe.net  Wed Feb 27 22:20:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 27 Feb 2013 22:20:34 -0500
Subject: [Bioperl-l] YAPC talks?
In-Reply-To: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
References: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
Message-ID: <42C1F1B8-FE26-43A8-B601-E80D17D215EC@drycafe.net>


On Feb 27, 2013, at 7:57 PM, John SJ Anderson wrote:

> Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl
> rules / Bioperl sucks" overview/state of the dist style talk...

Please videotape. I'll be sure to watch and promote it :-)

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From saladi1 at illinois.edu  Thu Feb 28 01:58:20 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Wed, 27 Feb 2013 22:58:20 -0800
Subject: [Bioperl-l] EUtilities Cookbook - Accn to gi
Message-ID: <CAARX5cXXD_DNb+Sbt-_zXvsn63QAaVBcot9YGtEjQ7ucrqAEKQ@mail.gmail.com>

Hi,

I think that rettype for the section "Get GIs for a list of accessions"
should be

-rettype => 'gi');

instead of 'gilist' as it is now. I think this change is due to a change in
NCBI eutils.

webpage:
http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Get_GIs_for_a_list_of_accessions

Thanks,
Shyam

From fossandonc at hotmail.com  Thu Feb 28 10:36:34 2013
From: fossandonc at hotmail.com (=?iso-8859-1?Q?Francisco_J._Ossand=F3n?=)
Date: Thu, 28 Feb 2013 12:36:34 -0300
Subject: [Bioperl-l] Fix for Bug #3376 broke somewhere else
Message-ID: <SNT133-ds14A180BAFAE068EE359031CFFE0@phx.gbl>

Hi,
I was re-checking Bug #3302 using the Bio::SearchIO modules of the
repository and found that now it can't parse a Hmmer2 file that was
previously fine. After tracking the problem, I discovered that a change in a
regular expression to fix another bug broke the parse.
 
The fix for Bug #3376 consisted of adding an extra condition to skip
lines where the end-of-domain indicator is split across lines
(https://redmine.open-bio.org/issues/3376):
TEST: domain 1 of 1, from 8 to 97: score 184.7, E = 2.5e-56
                   *->svfqqqqssksttgstvtAiAiAigYRYRYRAvtWnsGsLssGvnDn
                      sv+qqqq+  +    +vtAiAiAigYRYRYRAv Wn GsLs G nDn
        Test     8    SVYQQQQGGSA----MVTAIAIAIGYRYRYRAVVWNKGSLSTGTNDN 50   

                   DnDqqsdgLYtiYYsvtvpssslpsqtviHHHaHkasstkiiikiePr<-
                   DnDq +d LYtiYYsvtv +ss+p q+v+HHHaH+asstkiiiki P   
        Test    51 DNDQAAD-LYTIYYSVTVSASSWPGQSVTHHHAHPASSTKIIIKIAPS   97   

                   *

        Test     -   -
This case is characterized by the 2 dashes in the line...

So this expression was added in hmmer2.pm - 'next_result'
(https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2):
                        elsif (CORE::length($_) == 0
                            || ( $count != 1 && /^\s+$/o )
                            || /^\s+\-?\*\s*$/
                            || /^.+\-\s+\-\s*$/ ) ### <--- This regex was designed for bug 3376
                        {
                            next;
                        }

But the expression is too broad because it uses "^.+" just before the 2
dashes, and it broke the parsing of lines like these, which are full of dashes:
                   KyACrqCdtiVQAPaPakpIErGiptaGLLArvlVSKyaEHlPLYRQsEI
                                                                     
  lcl|gi|340     - -------------------------------------------------- -    

                   yaRqGVeiaRstLadWVgrtgarLaPLvdALaeyVLkeGklHADeTPVqV
                         +i  s L   V++ + r                           
  lcl|gi|340 60938 ------AIMISGLIHGVSARCLRF-------------------------- 60955

I think a reasonable fix that still resolves the original bug and restores
parsing for this case is to add an extra \s+ in the regex just before the
first dash, so the expression makes sure that the first dash is the one that
comes AFTER the description (replacing the usual coordinate number)
and is not the last dash of an alignment or of a series of dashes like the
one above:
                        elsif (CORE::length($_) == 0
                            || ( $count != 1 && /^\s+$/o )
                            || /^\s+\-?\*\s*$/
                            || /^.+\s+\-\s+\-\s*$/ ) ### <--- Tweaked regex
                        {
                            next;
                        }
I tested it and it works fine; I hope you find the fix acceptable.
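
A quick stand-alone check of the two expressions, as a sketch: the two sample lines are shortened copies of the lines quoted above (the dash run is abbreviated), and the variable names are only for illustration.

```perl
use strict;
use warnings;

my $old = qr/^.+\-\s+\-\s*$/;       # regex added for bug 3376
my $new = qr/^.+\s+\-\s+\-\s*$/;    # tweaked regex with the extra \s+

# End-of-domain leftover that should be skipped by the parser.
my $end_marker = '        Test     -   -';
# All-gap alignment line that must NOT be skipped.
my $gap_line   = '  lcl|gi|340     - ---------- -    ';

for my $case ([ 'end marker', $end_marker ], [ 'gap line', $gap_line ]) {
    my ($label, $line) = @$case;
    printf "%-10s old=%s new=%s\n", $label,
        ($line =~ $old ? 'skip' : 'keep'),
        ($line =~ $new ? 'skip' : 'keep');
}
```

With these two sample lines the original regex skips both, while the tweaked one skips only the end-of-domain leftover, because the dash before the final " -" in the gap line is preceded by another dash rather than by whitespace.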

Cheers,

--
Francisco J. Ossandon
Bioinformatician.
Ph.D. Candidate, University Andres Bello.
Center for Bioinformatics and Genome Biology,
Fundacion Ciencia para la Vida.
Santiago, Chile.
www.cienciavida.cl/CBGB.htm


From PDagosto at edgebio.com  Mon Feb 25 11:50:34 2013
From: PDagosto at edgebio.com (Phil Dagosto)
Date: Mon, 25 Feb 2013 16:50:34 +0000
Subject: [Bioperl-l] Error when running Build.PL
Message-ID: <DC8C6FE0AED292469CF192A00459937BC0F8660B@EDGE-EXCH02.edgebio.com>

Greetings,

I downloaded BioPerl 1.6.1 from this location: http://www.bioperl.org/wiki/Getting_BioPerl

When I ran Build.PL with all of the default settings chosen in the interactive mode I got the following error message:

Could not get valid metadata. Error is: Invalid metadata structure. Errors: 'Perl_5' for 'license' does not have a URL scheme (resources -> license) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::gff -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::WebAgent -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::EUtilParameters -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::OntologyIO::InterProParser -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Biblio::IO::medlinexml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::strider -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::RandomFactory -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA::ESEfinder -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::game::gameSubs -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::interpro -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::berkeleydb -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::entrezgene -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tinyseq -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::chadoxml -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::SeqIO::game::gameWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::FileCache -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::bsml_sax -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Primer3 -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::HtSNP -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tree::Compatible -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Taxonomy::entrez -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::agave -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::TagHaplotype -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::SeqFeature::Store::FeatureFileLoader -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::Protein* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::blastxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::EUtilities -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::Tree::Draw::Cladogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tigrxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Collection -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Draw::Pictogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::Writer::BSMLResultWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::HIVQuery -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::TreeIO::svggraph -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::eutils -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern::BackTranslate -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::GenBank -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Variation::IO::xml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::GraphViz -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Annotated -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::DB::NCBIHelper -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::HIV -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Run::RemoteBlast -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::excel -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::ClusterIO::dbsnp -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Microarray::Tools::ReseqChip -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::soap -> requires) [Validation: 1.2]
at /usr/local/lib/perl5/5.10.1/Module/Build/Base.pm line 4559

Could not create MYMETA files
Creating new 'Build' script for 'BioPerl' version '1.006001'

I have no idea whether this is a problem or not, or whether I can proceed. Also, I'm confused by the version number referenced in the last line. 1.006001 is our current version - I thought I was installing version 1.6.1. Are these version numbers equivalent, i.e., are the zeros not meaningful?
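
On the version question, Perl's core version module can compare the two notations directly. A small sketch, assuming only that the version module is available (it ships with Perl 5.10 and later):

```perl
use strict;
use warnings;
use version;

my $dotted  = version->parse('v1.6.1');      # dotted-decimal form
my $decimal = version->parse('1.006001');    # decimal form, 3 digits per part

print "numify: ", $dotted->numify, "\n";     # decimal form of v1.6.1
print "normal: ", $decimal->normal, "\n";    # dotted form of 1.006001
print "equal\n" if $dotted == $decimal;
```

In the decimal notation each component after the first takes three digits, so 1.006001 reads as 1, 006, 001, which is the same release as 1.6.1.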

I was actually looking for version 1.2.3 (or greater) - where can I find that?

Thanks,
Phil

Phil Dagosto
Sr. Software Engineer
Edge Bio
201 Perry Parkway, Suite 5
Gaithersburg, MD 20850

pdagosto at edgebio.com
(240) 912-8669


From chapmanb at 50mail.com  Thu Feb 28 21:30:01 2013
From: chapmanb at 50mail.com (Brad Chapman)
Date: Thu, 28 Feb 2013 21:30:01 -0500
Subject: [Bioperl-l] Coming soon: BOSC/Broad Hackathon, BOSC Codefest
Message-ID: <874ngvua1i.fsf@fastmail.fm>


Hi all; 
There are some upcoming coding events and conferences of interest to open source
biology programmers:

- BOSC/Broad Interoperability Hackathon -- This is a two day coding session at
  the Broad Institute in Cambridge, MA on April 7-8 focused on improving tool
  interoperability.
  
  Sign up and details: http://j.mp/XJT6ew
  
- Codefest at the Bioinformatics Open Source Conference -- This year BOSC is
  taking place in Berlin from July 19-20 and we'll have a two day coding session
  before the conference. This is the 4th year of Codefests and they've proven to
  be a productive and fun time to work collectively on open source projects.

  Sign up and details: http://www.open-bio.org/wiki/Codefest_2013
  BOSC conference: http://www.open-bio.org/wiki/BOSC_2013

Here are the key dates for the events and abstracts:

April  7-8, 2013: BOSC/Broad Interoperability Hackathon, Cambridge, MA
April   12, 2013: BOSC abstracts due
July 17-18, 2013: Codefest 2013, Berlin
July 19-20, 2013: BOSC 2013, Berlin

Looking forward to seeing everyone this spring and summer for plenty of fun
science and code,
Brad

From jason.stajich at gmail.com  Fri Feb  1 01:58:57 2013
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 31 Jan 2013 22:58:57 -0800
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
References: <mailman.7.1359565204.26693.bioperl-l@lists.open-bio.org>
	<575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
Message-ID: <CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>

Dan - 

I think the answer is yes if others are doing it - I am not in a position to be much of a main coder.

I don't know which format you speak of here or if you had to write something for the text blast changes or something else.  Specific bug reports on formats that aren't working are always helpful.  The XML format has been pretty stable, so I would suggest using that if you are simply parsing reports rather than looking at them.

Chris posted instructions on how to contribute, and the move to GitHub simplifies this.  That you had to write a whole new parser seems a bit severe - I hope that in the future people can speak up about problems sooner. If I hit a wall with something I can't do, I usually write the code to fix it and contribute it back, but I don't play follow-the-format-changes with the tools anymore; hopefully others like yourself can make the contributions.

If you are referring to the response I made to the question below, I don't think anyone will try to support NCBI's additional markup that refers to the upstream and downstream features, as laid out in the text files, without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable.
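
For anyone who needs those flanking-feature lines in the meantime, here is a minimal sketch of a stand-alone parser in plain Perl. It is not part of SearchIO; the sub name parse_flanking_features is hypothetical, and the line layout is assumed from the report excerpt quoted further down in this message, so other report versions may well differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect "Features flanking ..." entries from the
# lines of a text BLAST report. The layout is an assumption based on a
# single report excerpt.
sub parse_flanking_features {
    my @lines = @_;
    my @features;
    my $in_block = 0;
    for my $line (@lines) {
        if ($line =~ /^\s*Features flanking this part of subject sequence:/) {
            $in_block = 1;
            next;
        }
        next unless $in_block;
        if ($line =~ /^\s*(\d+)\s+bp at ([53])' side:\s*(.*?)\s*$/) {
            push @features, { distance => $1, side => "$2'", name => $3 };
        }
        else {
            $in_block = 0;    # any other line ends the block
        }
    }
    return @features;
}

my @report = (
    "Features flanking this part of subject sequence:",
    "8872 bp at 5' side: cytochrome P450 90B1",
    "402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K",
    "",
);
for my $f (parse_flanking_features(@report)) {
    print "$f->{side}\t$f->{distance}\t$f->{name}\n";
}
```

The resulting list of hashes could be keyed by hit name and combined with what SearchIO returns for the same report.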

best wishes,
Jason
On Jan 30, 2013, at 1:40 PM, Dan kilburn <dr_kilburn59 at yahoo.com> wrote:

> Hi Jason,
> 
> Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help.
> 
> --Dan
> Sent from my iPhone
> 
> On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote:
> 
>> Today's Topics:
>> 
>>  1. Re:  Parsing Blast-Report extracting "Features flanking    .."
>>     (Jason Stajich)
>> 
>> 
>> ----------------------------------------------------------------------
>> 
>> Message: 1
>> Date: Tue, 29 Jan 2013 11:00:16 -0800
>> From: Jason Stajich <jason.stajich at gmail.com>
>> Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features
>>   flanking    .."
>> To: buschj at hhu.de
>> Cc: bioperl-l at lists.open-bio.org
>> Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com>
>> Content-Type: text/plain;    charset=us-ascii
>> 
>> We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that.
>> 
>> However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features.
>> 
>> basically:
>> - download the genome and GFF for arabidopsis
>> - align your sRNA to the genome with a short read aligner - bowtie, bwa, others
>> - convert your sam to bam file with SAMtools or picard
>> - compare the location of features with the reads to get expression summaries or individual reads with BEDTools
>> 
>> 
>> On Jan 25, 2013, at 2:20 AM, jobu <buschj at hhu.de> wrote:
>> 
>>> On 22.01.2013 19:03, Mgavi Brathwaite wrote:
>>>> What upstream and downstream elements are you interested in?
>>> 
>>> 
>>> I've got a huge pile of short RNA reads.
>>> Part of the question now is whether those RNA fragments originate from
>>> siRNA events,
>>> or may represent miRNAs / parts of pre-miRNAs.
>>> 
>>> So I did an online  blast search against database nt.
>>> The resulting report quite often just gives subject information like this:
>>> 
>>> -----
>>>> gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence
>>> Length=23459830
>>> -----
>>> 
>>> Now I would like to get the hit's neighbouring regions for further
>>> analysis.
>>> Preferably I would like to do that in an automated way, but the only
>>> possible action with this kind of subject gi | description would be to
>>> fetch the entire chromosomal sequence, I guess?
>>> 
>>> However,
>>> right below the line above, the report states more precisely:
>>> 
>>> ------
>>> Features flanking this part of subject sequence:
>>> 8872 bp at 5' side: cytochrome P450 90B1
>>> 402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K
>>> ------
>>> 
>>> Still, I would like to be able to fetch the subject's sequence(s)
>>> automatically.
>>> As of now I think parsing the report with SearchIO won't let me acquire
>>> that information, because SearchIO does not recognize report sections
>>> like those.
>>> 
>>> I hope I did not miss any of SearchIOs capabilities, but I could not
>>> find any method covering my wish?!
>>> 
>>> Right now maybe the only way to get the information I want is to
>>> construct my own parser and write it out into a separate file, which in
>>> turn I could read into a hash before processing the Blast-Report
>>> with SearchIO to combine both data sets for further automated work.
>>> 
>>> I am aware though that even successfully getting the flanking features
>>> would leave me with the more or less wide  intergenic gap my hsp is
>>> located in.
>>> 
>>> However I'm in need of a way to get the flanking features including
>>> their annotation and the region spanning between them.
>>> But I hope I do not have to get complete sequences to accomplish that,
>>> as this would be kind of an overkill.
>>> 
>>> with kind regards
>>> Jochen
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From dr_kilburn59 at yahoo.com  Fri Feb  1 09:25:34 2013
From: dr_kilburn59 at yahoo.com (Dan Kilburn)
Date: Fri, 1 Feb 2013 06:25:34 -0800 (PST)
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>
References: <mailman.7.1359565204.26693.bioperl-l@lists.open-bio.org>
	<575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
	<CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>
Message-ID: <1359728734.27412.YahooMailNeo@web162006.mail.bf1.yahoo.com>

Hi Jason,

Thanks for the detailed feedback.  The real reason I had to write my own parser is that even with close, repeated support from NCBI we couldn't get XML output with short_web_blast.pl because the parameter that turns on XML output was not functioning (they've probably fixed it by now), and I had to crank out a parser asap to support a job talk.

I don't think the upstream and downstream feature reports are particularly useful, because in mammals they tend to be so far away that they are not likely to be biologically relevant.  But the internal motif reports are useful, maybe especially if you are blasting short reads, like I was.  A 16-mer preserved domain hit is really good if you're blasting 18-mer Illumina short reads.

As far as my involvement goes, I got diagnosed with cancer on Wednesday, so I'll be taking a step back until next week's surgery and taking a lot of deep breaths.  On the other hand, this just makes me more motivated: I've been thinking a lot about time, and timely contributions, the last two days.

Cheers,
Dan
 



From carandraug+dev at gmail.com  Sat Feb  2 20:44:31 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Sun, 3 Feb 2013 01:44:31 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue with
	output option
Message-ID: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>

Hi

the TCoffee module does not accept options of the named argument type:

-arg => option

one needs to write

'arg' => option

Is there a special reason for this? I tracked down this to the commit

7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e

12 years ago[1]. A comment on the code actually says "don't want named
parameters"[2] (though the commit message sounds pretty innocuous
"migrated to new Bio::Root::RootI chained new"). Is there a reason for
this? The rest of bioperl has no issue with named parameters, and the
API should be the same as Clustalw which also has no problem with it.
This is very easy to fix, I can submit a pull request no problem.

Also, shouldn't the code complain in the case of unsupported
options? It took me a very long time to find the problem because
there were no complaints coming from the code.

There is also a problem with the way it handles the output option.
I'll have to look closer into it, but the documentation is simply
incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
(undocumented), works fine.

Carnë
[1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
[2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374


From cjfields at illinois.edu  Sun Feb  3 16:54:51 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 3 Feb 2013 21:54:51 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue
 with	output option
In-Reply-To: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
References: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>

Carnë,

On Feb 2, 2013, at 7:44 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> the TCoffee module does not options of the named argument type:
> 
> -arg => option
> 
> one needs to do like
> 
> 'arg' => option
> 
> Is there a special reason for this? I tracked down this to the commit
> 
> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
> 
> 12 years ago[1]. A comment on the code actually says "don't want named
> parameters"[2] (though the commit message sounds pretty innocuous
> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
> this? The rest of bioperl has no issue with named parameters, and the
> API should be the same as Clustalw which also has no problem with it.
> This is very easy to fix, I can submit a pull request no problem.

IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones.  This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless of whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.

The downside of big changes like this: potential backwards compatibility issues.  Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change.  I don't have a problem breaking this with a bioperl 2.0 release, though.  
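
As a rough sketch only of how both spellings could be accepted, so that old scripts keep working: the helper normalize_args below is hypothetical and is not taken from the TCoffee wrapper, and the option names ktuple and matrix are used purely as illustrations.

```perl
use strict;
use warnings;

# Hypothetical helper (not the actual wrapper code): accept both the
# 'arg' and the '-arg' spelling and store them under one canonical key.
# If both spellings of the same option are passed, one of them silently
# wins; a real implementation would probably want to reject that.
sub normalize_args {
    my %args = @_;
    my %norm;
    for my $key (keys %args) {
        (my $name = $key) =~ s/^-//;    # strip one leading dash, if any
        $norm{$name} = $args{$key};
    }
    return %norm;
}

my %old = normalize_args('ktuple' => 3, 'matrix' => 'BLOSUM');
my %new = normalize_args(-ktuple  => 3, -matrix  => 'BLOSUM');
print join(',', map { "$_=$old{$_}" } sort keys %old), "\n";
print join(',', map { "$_=$new{$_}" } sort keys %new), "\n";
```

Both calls end up with the same keys, which is the property the existing tests would need to check before and after such a change.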

> Also, shouldn't the code complain in the case of non-supported
> options? Took me a very long time to find out the problem because
> there was no complaints coming from the code.

Yes, it should complain when options are given that do not make sense, some validation would help there.  With some modules this might be a side-effect of using AUTOLOAD or simply not checking the parameters.
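
A minimal sketch of that kind of validation, assuming a simple whitelist; the sub check_options and the short option list are made up for illustration and do not reflect the real set of T-Coffee options.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Illustrative whitelist only; the real list of options is longer.
my %SUPPORTED = map { $_ => 1 } qw(ktuple matrix output quiet);

# Hypothetical check: fail loudly on any option that is not supported,
# instead of silently ignoring it.
sub check_options {
    my %args = @_;
    my @unknown = sort grep { !$SUPPORTED{$_} } keys %args;
    croak "Unsupported option(s): @unknown" if @unknown;
    return 1;
}

check_options(ktuple => 3, output => 'fasta');      # passes silently

my $ok = eval { check_options(ktupel => 3); 1 };    # typo in option name
print "caught: $@" unless $ok;
```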

> There is also a problem with the way it handles the output option.
> I'll have to look closer into it, but the documentation is simply
> incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
> (undocumented), works fine.

That's entirely possible.

> Carnë
> [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
> [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374

As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it.  Infernal was this way IIRC.  Maybe these should just be simply stored as a semi-validated set of key-value pairs.  
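
To make the key-value idea concrete, here is a small sketch. The class My::ParamStore and the option names 'max-hits' and '2pass' are hypothetical; they are chosen only because neither could be a Perl method name, and they are not options of any real program.

```perl
use strict;
use warnings;

package My::ParamStore;    # hypothetical class, not part of BioPerl

sub new {
    my ($class, %allowed) = @_;
    # %allowed maps an option name to a regex its value must match
    return bless { allowed => {%allowed}, values => {} }, $class;
}

sub set_param {
    my ($self, $name, $value) = @_;
    my $check = $self->{allowed}{$name}
        or die "Unknown parameter '$name'\n";
    $value =~ $check or die "Bad value '$value' for '$name'\n";
    $self->{values}{$name} = $value;
    return $self;          # allow chaining
}

sub get_param {
    my ($self, $name) = @_;
    return $self->{values}{$name};
}

# Flatten the stored pairs to a command-line argument list, sorted by name.
sub to_args {
    my ($self) = @_;
    return map { ("--$_", $self->{values}{$_}) } sort keys %{ $self->{values} };
}

package main;

my $p = My::ParamStore->new('max-hits' => qr/^\d+$/, '2pass' => qr/^[01]$/);
$p->set_param('max-hits', 50)->set_param('2pass', 1);
print join(' ', $p->to_args), "\n";
```

The names never have to become subroutine names, so hyphens and leading digits are not a problem, and the per-option regex gives the "semi-validated" part.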

chris


From carandraug+dev at gmail.com  Sun Feb  3 23:34:22 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Mon, 4 Feb 2013 04:34:22 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue
 with output option
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_2b2+Dy-HW3ngjNd2tjaTxgvFpTR-rKzq7HOO-6ZzyoTQ@mail.gmail.com>

On 3 February 2013 21:54, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 2, 2013, at 7:44 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> Hi
>>
>> the TCoffee module does not options of the named argument type:
>>
>> -arg => option
>>
>> one needs to do like
>>
>> 'arg' => option
>>
>> Is there a special reason for this? I tracked down this to the commit
>>
>> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
>>
>> 12 years ago[1]. A comment on the code actually says "don't want named
>> parameters"[2] (though the commit message sounds pretty innocuous
>> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
>> this? The rest of bioperl has no issue with named parameters, and the
>> API should be the same as Clustalw which also has no problem with it.
>> This is very easy to fix, I can submit a pull request no problem.
>
> IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones.  This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.
>
> The downside of big changes like this: potential backwards compatibility issues.  Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change.  I don't have a problem breaking this with a bioperl 2.0 release, though.

Should passing the tests be enough? There's one for TCoffee. At the
moment I don't see how this would cause compatibility issues: we are
adding an option, not removing one. But the comment in the code,
stating plainly that the -param API was not wanted, caught me by
surprise, which is why I'm asking.

> As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it.  Infernal was this way IIRC.  Maybe these should just be simply stored as a semi-validated set of key-value pairs.

From a quick glance at the list of TCoffee parameters, I don't at the
moment see any that should cause problems.

I have submitted a bug report[1] which mentions some other issues I
found with TCoffee. If someone could comment on them, that would be
great and I can start fixing them.

Carnë

[1] https://redmine.open-bio.org/issues/3406


From whereverroadgoes at gmail.com  Mon Feb  4 10:39:19 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 07:39:19 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
Message-ID: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>

The result I get is:

Number of bases of type A = 
Number of bases of type C = 
Number of bases of type G = 
Number of bases of type T = 

i.e. the expected values are missing.
Please help!

#! /usr/bin/perl

use Bio::Tools::SeqStats;
use Bio::Seq;

open (FILE, "seq.fasta");
@array = <FILE>;

# Removing first line of fasta

shift (@array);
$array = join('', @array);
open (FILE2, ">>seq2.fasta");
print FILE2 "$array";

$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
- alphabet => 'dna',);


my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);

my $monomer_ref = $seq_stats->count_monomers();

foreach $base (sort keys %$monomer_ref) {
print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";
}


From hamish.mcwilliam at bioinfo-user.org.uk  Mon Feb  4 11:59:16 2013
From: hamish.mcwilliam at bioinfo-user.org.uk (Hamish McWilliam)
Date: Mon, 4 Feb 2013 16:59:16 +0000
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
In-Reply-To: <loom.20130201T045704-740@post.gmane.org>
References: <200305311150.h4VBopn2019091@localhost.localdomain>
	<loom.20130201T045704-740@post.gmane.org>
Message-ID: <CABqDwwLHWp2fZm5h8KJmZhBFV6QmNLJrg5OE=hR+9U3Y3UJ7_g@mail.gmail.com>

BLASTCLUST is part of the legacy NCBI BLAST package (not NCBI BLAST+)
and can be obtained from:

ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST

As Robert notes there are many other tools which can be used to
perform sequence clustering, Wikipedia has a Sequence Clustering
article (http://en.wikipedia.org/wiki/Sequence_clustering) which lists
some of the most commonly used.

All the best,

Hamish

On 1 February 2013 04:15, Rob <yuf228 at hotmail.com> wrote:
> Cyril C.C. Chua <bmbcccc <at> bmb.leeds.ac.uk> writes:
>
>>
>> Hi,
>>
>> I have some difficulty in sourcing for BLASTCLUST or related
>> programs/mods. Does any1 know exactly how to locate them?
>>
>> Regards
>>
>> Cyril Chua
>>
>
>
> Hi Cyril,
>
> I heard of the following programmes that might do similar things (I HAVEN'T
> used any of them yet):
>
> Afree - http://www.vicbioinformatics.com/software.afree.shtml
> Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf
> Usearch - http://www.drive5.com/usearch/
> DomClust - http://mbgd.genome.ad.jp/domclust/
>
> or
>
> Check this:
>
> http://ppod.princeton.edu/help/help_tech.html
>
> God bless,
>
>
> Robert


--
----
"Saying the internet has changed dramatically over the last five years
is cliché - the internet is always changing dramatically" - Craig
Labovitz, Arbor Networks.


From whereverroadgoes at gmail.com  Mon Feb  4 12:34:10 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 09:34:10 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
Message-ID: <b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>

Thanks Roy,

It still doesn't seem to produce anything. :/


From roy.chaudhuri at gmail.com  Mon Feb  4 12:51:03 2013
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Mon, 4 Feb 2013 17:51:03 +0000
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
Message-ID: <CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>

Sorry, I'd missed another problem in your code - you are trying to
load a fasta file using Bio::PrimarySeq. To read sequence data from a
file you should use Bio::SeqIO, see:

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_file
http://www.bioperl.org/wiki/HOWTO:SeqIO

Cheers,
Roy.


From asjo at koldfront.dk  Mon Feb  4 12:58:25 2013
From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Mon, 04 Feb 2013 18:58:25 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> (Slym's
	message of "Mon, 4 Feb 2013 07:39:19 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID: <8738xc2c72.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 07:39:19 -0800 (PST), Slym wrote:

> #! /usr/bin/perl

> use Bio::Tools::SeqStats;
> use Bio::Seq;

It can be a good idea to add "use strict; use warnings;" to the top of
your script. At least two problems in your program would have been
caught by perl if you had.

> open (FILE, "seq.fasta");

Using (global) literal filehandles and the two parameter open() is
somewhat outdated, a more current way to do it could be:

  open my $fh, '<', 'seq.fasta' or die "Cannot open seq.fasta: $!";

> @array = <FILE>;

> # Removing first line of fasta

> shift (@array);
> $array = join('', @array);
> open (FILE2, ">>seq2.fasta");
> print FILE2 "$array";

Note that you are writing just the sequence to your seq2.fasta file
here, so the new file isn't really a fasta file.

> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
> - alphabet => 'dna',);

Bio::PrimarySeq doesn't take a '-file' parameter. Also, note that the
filename is different than before "sekw2" vs. "seq2"!

Either you should use Bio::SeqIO with a '-file' parameter, or you can
use Bio::PrimarySeq with a '-seq' parameter.
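
A minimal sketch of the second option, assuming the sequence is already available as a plain string and BioPerl is installed (the sequence and the ID 'test' are made up for illustration):

```perl
use strict;
use warnings;

use Bio::PrimarySeq;
use Bio::Tools::SeqStats;

# Build the sequence object directly from a string instead of a file.
my $seqobj = Bio::PrimarySeq->new(
    -seq      => 'aaaacccggt',
    -alphabet => 'dna',
    -id       => 'test',
);

my $seq_stats   = Bio::Tools::SeqStats->new(-seq => $seqobj);
my $monomer_ref = $seq_stats->count_monomers();
foreach my $base (sort keys %$monomer_ref) {
    print "$base = $monomer_ref->{$base}\n";
}
```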

> my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);

> my $monomer_ref = $seq_stats->count_monomers();

> foreach $base (sort keys %$monomer_ref) {
> print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";

Here you wanted $monomer_ref->{$base}, as %monomer_ref isn't mentioned
anywhere else.
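
To illustrate the difference in isolation, here is a small stand-alone sketch in plain Perl with made-up counts. The second half shows, via a string eval, what "use strict" reports for the wrong form.

```perl
use strict;
use warnings;

my $monomer_ref = { A => 4, C => 3, G => 2, T => 1 };    # made-up counts

# $monomer_ref->{A} follows the reference; $monomer_ref{A} would instead
# look in a separate hash named %monomer_ref, which was never declared.
print "A = $monomer_ref->{A}\n";

# Under "use strict" the wrong form does not even compile. The string
# eval inherits the strict setting, so the error ends up in $@.
my $compiled = eval q{ my $ref = { A => 4 }; my $x = $ref{A}; 1 };
print "strict says: $@" unless $compiled;
```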

> }

Here is a complete version of your script - I chose to use Bio::SeqIO -
that works:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use Bio::SeqIO;
  use Bio::Tools::SeqStats;

  my $io=Bio::SeqIO->new(-file=>'seq.fasta', -alphabet=>'dna');
  my $seqobj=$io->next_seq; # Get the first sequence from the file

  my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
  my $monomer_ref = $seq_stats->count_monomers();
  foreach my $base (sort keys %$monomer_ref) {
      print "Liczba zasad typu ", $base," = ", $monomer_ref->{$base},"\n";
  }

E.g.:

  $ cat seq.fasta
  >test
  aaaacccggt
  $ ./slym.pl 
  Liczba zasad typu A = 4
  Liczba zasad typu C = 3
  Liczba zasad typu G = 2
  Liczba zasad typu T = 1
  $ 


  Best regards,

    Adam

-- 
 "Grittings. Ma nam is Kahlfin."                              Adam Sjøgren
                                                         asjo at koldfront.dk


From whereverroadgoes at gmail.com  Mon Feb  4 13:02:29 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 10:02:29 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
Message-ID: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>

The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
error (saying that it wants input provided by Bio::PrimarySeq).
(btw in this line
 $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 
'dna',); 
there's a typo "sekw2" instead of "seq2" but this is correct in my original 
code).


From cjfields at illinois.edu  Mon Feb  4 13:54:39 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Mon, 4 Feb 2013 18:54:39 +0000
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE161ED@CHIMBX5.ad.uillinois.edu>

Please make sure to read both Roy's and Adam's responses all the way through; Bio::SeqIO is not a sequence object but the front-end for format parsing (e.g. FASTA).  Bio::PrimarySeq does not have a '-file' parameter; Bio::SeqIO does.

If SeqStats truly doesn't work with Bio::Seq we can fix that, but according to Adam he has tested it using Bio::SeqIO and it seems to work.

chris

On Feb 4, 2013, at 12:02 PM, Slym <whereverroadgoes at gmail.com>
 wrote:

> The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
> error (saying that it wants input provided by Bio::PrimarySeq).
> (btw in this line
> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 
> 'dna',); 
> there's a typo "sekw2" instead of "seq2" but this is correct in my original 
> code).
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From asjo at koldfront.dk  Mon Feb  4 15:00:32 2013
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Mon, 04 Feb 2013 21:00:32 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com> (Slym's
	message of "Mon, 4 Feb 2013 10:02:29 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
Message-ID: <87txpr26jj.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 10:02:29 -0800 (PST), Slym wrote:

> The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
> error (saying that it wants input provided by Bio::PrimarySeq).

That sounds like you forgot to call ->next_seq() on the Bio::SeqIO
object - to get a sequence object - please see the complete, working
example I sent earlier.
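
For anyone reading without that earlier message, the pattern can be
sketched roughly as below. This is only a minimal sketch: it assumes
BioPerl is installed, and the sequence and the temporary FASTA file are
made up for illustration, not taken from the original code.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Bio::SeqIO;
use Bio::Tools::SeqStats;

# Write a tiny FASTA file so the example is self-contained (made-up data).
my ($fh, $fasta) = tempfile(SUFFIX => '.fasta', UNLINK => 1);
print {$fh} ">example\nAACCGGTTAA\n";
close $fh;

# Bio::SeqIO is the parser front-end; it is not itself a sequence object.
my $in = Bio::SeqIO->new(-file => $fasta, -format => 'fasta');

# next_seq() hands back a sequence object, which is what SeqStats wants.
my $seqobj = $in->next_seq or die "no sequence found in $fasta\n";

my $stats  = Bio::Tools::SeqStats->new(-seq => $seqobj);
my $counts = $stats->count_monomers;    # hash reference: base => count

for my $base (sort keys %$counts) {
    print "$base: $counts->{$base}\n";
}
```

Passing $in itself (the Bio::SeqIO object) to Bio::Tools::SeqStats would
be the kind of mistake that gives the error quoted above.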


  Best regards,

    Adam

-- 
 "Denial springs eternal."                                    Adam Sjøgren
                                                         asjo at koldfront.dk


From scott at scottcain.net  Tue Feb  5 09:45:14 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 5 Feb 2013 09:45:14 -0500
Subject: [Bioperl-l] Have your say in the 2013 GMOD Community Survey!
Message-ID: <CA+JTaoy5NZubXo2jQ8oDN20BQ5BAHg3B9ZmYZRJ6f2Ryr+-awQ@mail.gmail.com>

Give us your thoughts on the GMOD project and win a personal DNA test
from 23andMe!

The GMOD project provides tools like GBrowse, Galaxy, MAKER, JBrowse,
Tripal, Apollo, Chado, and many more to a huge community of users and
developers around the world.

To make sure that GMOD is giving you the support you need, we want to
know how you use GMOD, which components you find valuable, your
opinion on support, training, and GMOD's strengths and weaknesses.
Your feedback is vital in helping GMOD to serve its user community
more effectively and to suggest future directions for the project.

Do the survey: http://gmod.org/survey.html

The survey should take between 10 and 15 minutes (including thinking
time), and participants can enter a draw to win "A Journey Through
Your DNA", the personal DNA test from 23andMe (the winner can pick a
$50 Amazon gift voucher if they prefer).

The survey will be open until March 1st. Results will be collated and
discussed at the April 2013 GMOD Meeting in Cambridge, UK, and posted
on the GMOD wiki at http://gmod.org.

Please spread the word to other friends and colleagues who use GMOD:
the more voices we hear, the better the picture we get of the needs of
our users, and the better we can help you!

Do the survey: http://gmod.org/survey.html

If you have any questions or problems with the survey, please email me
-- I will be happy to help out!

Thanks,
Scott


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From tiago.hori at gmail.com  Tue Feb  5 10:21:55 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:21:55 -0800 (PST)
Subject: [Bioperl-l] Search I::O
Message-ID: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>

Hi All,

I am trying to find the best putative orthologs for 44K Atlantic Salmon 
sequences, and so I need to parse 44K BLAST reports to find the best human 
hit. I am trying to learn Bio::SearchIO, but when I try the first example in 
the HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast'
               -file => 'C001R047.txt');

while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }  
  }
}

I get this error: Odd number of elements in hash assignment at 
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

I am using BioPerl version 1.6.901. Is there a format problem with the 
blast reports?
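
Judging only from the snippet above, the reports themselves may not be
the cause. There is no comma after 'blast' in the constructor call, so
Perl most likely evaluates 'blast' - 'file' as a subtraction and passes
three values instead of four, which would fit an "Odd number of elements"
message. The sketch below reproduces that effect in plain Perl; arg_count
is a hypothetical helper for illustration and is not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments a call actually receives.
sub arg_count { return scalar @_ }

# Without the comma, 'blast' - 'file' is evaluated as a subtraction (giving 0),
# so only three values are passed and a hash built from them is unbalanced.
my $without_comma = do {
    no warnings 'numeric';    # hide the "isn't numeric" warnings in this demo
    arg_count( -format => 'blast' -file => 'test.txt' );
};

# With the comma, four values arrive as two clean key/value pairs.
my $with_comma = arg_count( -format => 'blast', -file => 'test.txt' );

print "without comma: $without_comma arguments\n";
print "with comma:    $with_comma arguments\n";
```

If that is indeed what is happening, adding the comma after 'blast' in
the Bio::SearchIO call should be enough.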

Any help would be greatly appreciated!

T.


From tiago.hori at gmail.com  Tue Feb  5 10:33:32 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:33:32 -0800 (PST)
Subject: [Bioperl-l] Search::IO example from HOWTO
Message-ID: <c87907a1-18da-49ed-ad70-55ca7bd27658@googlegroups.com>

Hi All,

I am trying to run the example from the Bio::SearchIO HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast'
               -file => 'test.txt');

while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }  
  }
}

And I get this error: Odd number of elements in hash assignment at 
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

Can anybody help?

Cheers,

T.


From carandraug+dev at gmail.com  Tue Feb  5 13:56:21 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 18:56:21 +0000
Subject: [Bioperl-l] removing packages from bioperl-live
Message-ID: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>

Hi

some of the bioperl-live packages have already been split into
separate repositories. However, they were never actually removed from
bioperl-live. This creates 2 entry points for bug fixes and
implementations. After a chat on #bioperl, I was told to ask here.

Should these be removed? For example, there's bioperl-FeatureIO but
that code also exists in bioperl-live. Can I remove it from
bioperl-live?

Carnë


From cjfields at illinois.edu  Tue Feb  5 14:34:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 19:34:07 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from
 bioperl-live
In-Reply-To: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>

Probably should retitle this to ask the question directly (make sure the right radars are pinged).

My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).

chris

On Feb 5, 2013, at 12:56 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> some of the bioperl-live packages have already been split into
> separate repositories. However, they were never actually removed from
> bioperl-live. This creates 2 entry points for bug fixes and
> implementations. After a chat on #bioperl, I was told to ask here.
> 
> Should these be removed? For example, there's bioperl-FeatureIO but
> that code also exists in bioperl-live. Can I remove it from
> bioperl-live?
> 
> Carnë


From scott at scottcain.net  Tue Feb  5 14:36:10 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 5 Feb 2013 14:36:10 -0500
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID: <CA+JTaowxkgy+2ytqHG-MG6VrOdT7jGLQ9-_TJfVA3COsLgUZYw@mail.gmail.com>

I'm sure it will lead to lots of fun, but I suspect you are right and
it should be removed.  It's time you yank on that bandaid :-)

Scott


On Tue, Feb 5, 2013 at 2:34 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>
> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>
> chris
>
> On Feb 5, 2013, at 12:56 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> Hi
>>
>> some of the bioperl-live packages have already been split into
>> separate repositories. However, they were never actually removed from
>> bioperl-live. This creates 2 entry points for bug fixes and
>> implementations. After a chat on #bioperl, I was told to ask here.
>>
>> Should these be removed? For example, there's bioperl-FeatureIO but
>> that code also exists in bioperl-live. Can I remove it from
>> bioperl-live?
>>
>> Carnë


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From carandraug+dev at gmail.com  Tue Feb  5 15:06:23 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 20:06:23 +0000
Subject: [Bioperl-l] dependencies on perl version
Message-ID: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>

Hi

how much perl backwards compatibility does bioperl need to keep?

If I have something I want to implement and use state (requires
5.010), is it acceptable? 5.010 is already a quite old perl version.
Of course, there are other less elegant ways to implement those
features. If I can't use modern perl stuff, what version number is the
limit?

Carnë


From carandraug+dev at gmail.com  Tue Feb  5 15:10:01 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 20:10:01 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>

On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>
> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).

Mentioning Bio::FeatureIO was just an example. I meant the question
more generally: if the code is already in a separate repository, should
it be removed from bioperl-live?

Carnë


From cjfields at illinois.edu  Tue Feb  5 15:56:48 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:56:48 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>

Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.  

(for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
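
As a rough illustration of what is at stake, the sketch below puts a
'state' variable, guarded by 'use 5.010', next to a closure that behaves
the same way on perl 5.8. The sub names are hypothetical and only meant
to show the trade-off, not code from bioperl-live.

```perl
use 5.010;    # refuses to run on older perls and enables 'say' and 'state'
use strict;
use warnings;

# 5.10+ version: the counter keeps its value between calls.
sub next_id_state {
    state $n = 0;
    return ++$n;
}

# 5.8-compatible equivalent: a closure over a lexical in an enclosing block.
{
    my $n = 0;
    sub next_id_closure { return ++$n }
}

say next_id_state()   for 1 .. 3;
say next_id_closure() for 1 .. 3;
```

Both subs count 1, 2, 3 across the three calls; the closure is a little
noisier but needs nothing newer than 5.8.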

chris

On Feb 5, 2013, at 2:06 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> how much perl backwards compatibility does bioperl need to keep?
> 
> If I have something I want to implement and use state (requires
> 5.010), is it acceptable? 5.010 is already a quite old perl version.
> Of course, there are other less elegant ways to implement those
> features. If I can't use modern perl stuff, what version number is the
> limit?
> 
> Carnë


From cjfields at illinois.edu  Tue Feb  5 15:59:38 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:59:38 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 2:10 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>> 
>> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
> 
> Mentioning Bio::FeatureIO was just an example. I meant the question
> more generally: if the code is already in a separate repository, should
> it be removed from bioperl-live?
> 
> Carnë

Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better).  Once we get a new release out we should remove the rest.

chris


From cjfields at illinois.edu  Tue Feb  5 16:53:29 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 21:53:29 +0000
Subject: [Bioperl-l] Next BioPerl release
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>

All,

I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  

Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:

    https://github.com/bioperl/Bio-FeatureIO

Feedback, suggestions, etc are greatly appreciated.

chris


From miker at htblis.com  Tue Feb  5 19:54:17 2013
From: miker at htblis.com (Michael Rogoff)
Date: Tue, 5 Feb 2013 16:54:17 -0800
Subject: [Bioperl-l] Bio::Graphics error when rendering features with Split
	locations
Message-ID: <C71FF11A-F2E2-4204-9A10-50F5535A0C81@htblis.com>

When trying to render features from a genbank file that include a split location e.g.:

     promoter        join(1000..1080,1..5)
                     /label=PROM1

The following exception is raised:
Can't locate object method "has_tag" via package "Bio::Location::Simple" at lib/perl5/site_perl/5.10.1/Bio/Graphics/Glyph.pm line 704, <GEN0> line 36.

This can be reproduced with the code in the example "Rendering Features from a GenBank or EMBL File" from the Graphics HOW-TO:
http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File

Is there a way to change the script so that split locations would, at the very least, not cause a fatal error?  Is there a different glyph type that needs to be used?  Thanks in advance for any help.

I've attached a simple genbank input that will reproduce the error:

LOCUS       sample2     1080 bp DNA    circular
DEFINITION  Cloning vector sample2
ACCESSION   sample2
VERSION     sample2.1  GI:4352432
COMMENT     Component Fragments
FEATURES               Location/Qualifiers
     terminator      39..328
                     /label=TERM1
                     /note="terminator 1"
     misc_feature    393..488
                     /label=MF1
     CDS             complement(800..900)
                     /label=CDS1
                     /note="resistence gene"
     promoter        join(1000..1080,1..5)
                     /label=PROM1
ORIGIN
        1  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
       61  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      121  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      181  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      241  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      301  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      361  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      421  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      481  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      541  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      601  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      661  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      721  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      781  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      841  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      901  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      961  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
     1021  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
//


P.S.  I think I have traced the source of the problem to Glyph's _subfeat method, which in the case of a feature with split locations is returning location objects instead of feature objects.  Is this a bug?

sub _subfeat {
  my $class   = shift;
  my $feature = shift;

  return $feature->segments     if $feature->can('segments');

  my @split = eval { my $id   = $feature->location->seq_id;
                     my @subs = $feature->location->sub_Location;
                     grep {$id eq $_->seq_id} @subs;
                   };

  return @split if @split;

  # Either the APIs have changed, or I got confused at some point...
  return $feature->get_SeqFeatures         if $feature->can('get_SeqFeatures');
  return $feature->sub_SeqFeature          if $feature->can('sub_SeqFeature');
  return;
}


From l.m.timmermans at students.uu.nl  Tue Feb  5 21:40:27 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 03:40:27 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>

On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>
> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)

I *really* hate saying it, but I fear a lot of places are still stuck
on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
department still is and doesn't seem to be in a hurry to upgrade, and
I'm pretty sure it won't be the only one (though personally I use a
self-compiled 5.16).

Leon


From florent.angly at gmail.com  Tue Feb  5 21:51:27 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:51:27 +1000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
Message-ID: <5111C52F.50101@gmail.com>

On 06/02/13 06:59, Fields, Christopher J wrote:
> On Feb 5, 2013, at 2:10 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
>>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>>>
>>> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>> Mentioning Bio::FeatureIO was just an example. I meant the question
>> more generally: if the code is already in a separate repository, should
>> it be removed from bioperl-live?
>>
>> Carnë
> Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better).  Once we get a new release out we should remove the rest.
>
> chris

Sounds good to me (I've been burnt once by the fact that Bio::FeatureIO 
is in two places).
Florent


From florent.angly at gmail.com  Tue Feb  5 21:56:19 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:56:19 +1000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
Message-ID: <5111C653.2010703@gmail.com>

For what it's worth, the current stable version of Debian uses perl 
5.10.1 (http://packages.debian.org/stable/perl/perl).
Florent

On 06/02/13 12:40, Leon Timmermans wrote:
> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>
>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
> I *really* hate saying it, but I fear a lot of places are still stuck
> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
> department still is and doesn't seem to be in a hurry to upgrade, and
> I'm pretty sure it won't be the only one (though personally I use a
> self-compiled 5.16).
>
> Leon


From hlapp at drycafe.net  Tue Feb  5 22:27:35 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 5 Feb 2013 22:27:35 -0500
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <09524241-59F8-4BFF-8054-53CD0A649C11@drycafe.net>


On Feb 5, 2013, at 4:53 PM, Fields, Christopher J wrote:

> I am scheduling the next BioPerl CPAN release tentatively for March 1.

Yay!! Thanks for your leadership again, Chris, and for volunteering your time for the project. If nothing else, and I know this is no compensation really worth speaking of, we owe you beer, and I'll certainly pay my debt to you in Berlin if you come there.

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From hlapp at drycafe.net  Tue Feb  5 22:32:40 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 5 Feb 2013 22:32:40 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <5111C653.2010703@gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
Message-ID: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>

Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS.

8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.

	-hilmar

On Feb 5, 2013, at 9:56 PM, Florent Angly wrote:

> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl).
> Florent
> 
> On 06/02/13 12:40, Leon Timmermans wrote:
>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
>> <cjfields at illinois.edu> wrote:
>>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>> 
>>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
>> I *really* hate saying it, but I fear a lot of places are still stuck
>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
>> department still is and doesn't seem to be in a hurry to upgrade, and
>> I'm pretty sure it won't be the only one (though personally I use a
>> self-compiled 5.16).
>> 
>> Leon

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From cjfields at illinois.edu  Tue Feb  5 22:58:08 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 03:58:08 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18CBE@CHIMBX5.ad.uillinois.edu>

Re: being held back, I agree.  I don't necessarily want to intentionally break current modules by adding modern code unless it can be demonstrated to be a decent benefit performance-wise, but I don't want to impede new additions by requiring compat with perl 5.8 (hence my suggestion of a 'use 5.01x' pragma when appropriate).

Ubuntu 12.04 LTS is on perl 5.14.2: 

    http://askubuntu.com/questions/80672/what-perl-version-will-be-in-12-04-lts

BTW, I was wrong about perl 5.8 being 8 yrs old; it's almost 11 yrs old (perl 5.8.0 was released on 7/18/2002).  perl 5.8 reached end-of-life in 2008, fixes being only for security reasons.

So, I support dropping perl 5.8 support, but we should have a decent route of use for the folks stuck on old clusters.

chris

On Feb 5, 2013, at 9:32 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:

> Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS.
> 
> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.
> 
> 	-hilmar
> 
> On Feb 5, 2013, at 9:56 PM, Florent Angly wrote:
> 
>> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl).
>> Florent
>> 
>> On 06/02/13 12:40, Leon Timmermans wrote:
>>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
>>> <cjfields at illinois.edu> wrote:
>>>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>>> 
>>>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
>>> I *really* hate saying it, but I fear a lot of places are still stuck
>>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
>>> department still is and doesn't seem to be in a hurry to upgrade, and
>>> I'm pretty sure it won't be the only one (though personally I use a
>>> self-compiled 5.16).
>>> 
>>> Leon
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> -- 
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
> ===========================================================


From l.m.timmermans at students.uu.nl  Tue Feb  5 23:11:52 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 05:11:52 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
Message-ID: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>

On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp <hlapp at drycafe.net> wrote:
> Does anyone know what Ubuntu uses?

5.14.2, distrowatch is your friend ;-)

> I've heard lots of other old version problems with CentOS.

I know people who still use CentOS 4 in production :-|

> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.

CentOS 5 is 6 years old (and will be supported another 4), but CentOS
6 is 'only' 19 months. perl missing a release in the 5.8-5.10
timeframe combined with an unfortunate alignment of its release
schedule with Red Hat's doesn't do us any favors here.

Leon


From cjfields at illinois.edu  Tue Feb  5 23:14:24 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 04:14:24 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18E52@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 8:40 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>> 
>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
> 
> I *really* hate saying it, but I fear a lot of places are still stuck
> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
> department still is and doesn't seem to be in a hurry to upgrade, and
> I'm pretty sure it won't be the only one (though personally I use a
> self-compiled 5.16).
> 
> Leon

We had the same problem for a while, but our sysadmins were willing to set up perl 5.12 (at that time) loadable as a module (we can of course set up a local perl as well).  We're now using a sysadmin-installed perl 5.16 with our current cluster.

chris


From cjfields at illinois.edu  Tue Feb  5 23:24:31 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 04:24:31 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 10:11 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp <hlapp at drycafe.net> wrote:
>> Does anyone know what Ubuntu uses?
> 
> 5.14.2, distrowatch is your friend ;-)
> 
>> I've heard lots of other old version problems with CentOS.
> 
> I know people who still use CentOS 4 in production :-|
> 
>> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.
> 
> CentOS 5 is 6 years old (and will be supported another 4), but CentOS
> 6 is 'only' 19 months. perl missing a release in the 5.8-5.10
> timeframe combined with an unfortunate alignment of its release
> schedule with Red Hat's doesn't do us any favors here.
> 
> Leon

Right, it took ~5.5 yrs to go from 5.8 to 5.10.  I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7).

We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases.

chris


From l.m.timmermans at students.uu.nl  Tue Feb  5 23:33:57 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 05:33:57 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXAjt8m9Go9YkGFOUkxw92FUoLFbs0Q_fys-f_gyAwX8yw@mail.gmail.com>

On Wed, Feb 6, 2013 at 5:24 AM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Right, it took ~5.5 yrs to go from 5.8 to 5.10.  I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7).
>
> We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases.

Sounds reasonable. These things shouldn't come as a surprise.

I suspect that the thing that will save us is that most of these
people install it once and then never upgrade.

Leon


From hartzell at alerce.com  Wed Feb  6 12:58:07 2013
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 6 Feb 2013 09:58:07 -0800
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
Message-ID: <20754.39343.128576.743448@gargle.gargle.HOWL>

Fields, Christopher J writes:
 > [...]
 > Right, it took ~5.5 yrs to go from 5.8 to 5.10.  I'd like to point
 > out that Python users are in the same boat: the Python version for
 > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
 > (and recommends python 2.7).   
 > 
 > We can always state that perl 5.8 is supported for the upcoming
 > Bioperl release, but we're dropping v5.8 support for any future
 > releases. 

Do more than drop support for 5.8.

The Perl community has put a transparent and predictable process in
place for releasing [generally] better versions of the language.  It
means that Perl has a chance of continuing to be relevant, attracting
new talent and actually *fixing* some of the s&%t that gives Perl a
bad rap.  It gives people something to plan around; no one should be
surprised that v5.X.Y is coming out in mid-20ZZ.

BioPerl should do the same thing: declare a release policy that trails
along with the Perl release schedule.  Keep it simple and no one can
argue with it.  Support Perl releases as long as the releases
themselves are supported.

Rather than expending energy supporting out of date platforms, put the
energy into being modern (or Modern...), better distro building and
packaging, testing, documentation and releasing so that the process of
staying current is painless.

Look forward.  Keep it interesting and fun.

Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
make their living running sequencing gels in Plexiglas doohickeys on
their lab bench?

I'm not suggesting that the BioPerl community is free to make
arbitrary and capricious changes that make it difficult for *anyone*
to get anything done.  Churn is a waste of time.

But why should the all-volunteer BioPerl community be stuck supporting
code from 12 years ago because it's cost effective for someone else to
avoid spending *their* $/time/people to stay up to date?

Those sites that value stability/maturity/stagnation so highly have
already accepted the cost/difficulty of nailing one of their feet to
the floor as they try to run forward.  They recognize and depend on
the benefits of having that stable base but generally they've also
accepted the costs associated with their restrictive choices.  They
know how to pull in separate kernel/driver updates so that they can
actually run on nearly modern hardware.  They know, and live with, the
fact that they're not going to have access to the shiny new stuff.
And they know how to stay up to date, when they need to, with the
software that their users need to be competitive (e.g. BioConductor
and R).

As long as (if/when...) updating a BioPerl release is something that
can reliably happen with a few cpanm invocations then the sites that
otherwise favor punctuated equilibrium will learn to handle gradual
change.

Those folks that are "stuck" on older releases always have the option
of supporting professional Perl programmers to keep older releases
going, backport changes, etc.  They're already buying support for
their platforms (or freeloading and coping); let them put bread on the
table at one of the bioinformatics consultancies or labs if they have
something special they need.

Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
one is paying you to be backwards compatible with the previous
millennium.

g.


From amackey at virginia.edu  Wed Feb  6 13:47:46 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Wed, 6 Feb 2013 13:47:46 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
Message-ID: <CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>

Huzzah!

--
Aaron J. Mackey, PhD
Assistant Professor
Center for Public Health Genomics
University of Virginia
amackey at virginia.edu
http://www.cphg.virginia.edu/mackey


On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell <hartzell at alerce.com> wrote:

> Fields, Christopher J writes:
>  > [...]
>  > Right, it took ~5.5 yrs to go from 5.8 to 5.10.  I'd like to point
>  > out that Python users are in the same boat: the Python version for
>  > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>  > (and recommends python 2.7).
>  >
>  > We can always state that perl 5.8 is supported for the upcoming
>  > Bioperl release, but we're dropping v5.8 support for any future
>  > releases.
>
> Do more than drop support for 5.8.
>
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around; no one should be
> surprised that v5.X.Y is coming out in mid-20ZZ.
>
> BioPerl should do the same thing: declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
>
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
>
> Look forward.  Keep it interesting and fun.
>
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
>
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that make it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
>
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date?
>
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
>
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
>
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc.  They're already buying support for
> their platforms (or freeloading and coping); let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
>
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
>
> g.


From tiago.hori at gmail.com  Wed Feb  6 08:25:41 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Wed, 6 Feb 2013 05:25:41 -0800 (PST)
Subject: [Bioperl-l] Problems installing Bio::Tools::Run:StandAloneBlastPlus
Message-ID: <9b488c6e-34b3-4269-a7ac-e2206720939a@googlegroups.com>

Hi Guys,

I am trying to install the module Bio::Tools::Run::StandAloneBlastPlus, but 
it has been hard so far.

I managed to compile and install samtools after finding all the 
dependencies, but I am still missing something. I have posted the complete 
report below.

Any help would be great!

Cheers,

T.

cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus
Reading '/home/tiagohori/.cpan/Metadata'
  Database was generated on Tue, 05 Feb 2013 18:41:03 GMT
Running install for module 'Bio::Tools::Run::StandAloneBlastPlus'
Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz
Checksum for /home/tiagohori/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz ok
Scanning cache /home/tiagohori/.cpan/build for sizes
..................................------------------------------------------DONE
DEL(1/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz 
DEL(2/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz.yml 
DEL(3/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO 
DEL(4/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO.yml 
DEL(5/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC 
DEL(6/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC.yml 
DEL(7/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt 
DEL(8/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt.yml 
DEL(9/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4 
DEL(10/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4.yml 
DEL(11/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5 
DEL(12/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5.yml 
DEL(13/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn 
DEL(14/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn.yml 
DEL(15/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o 
DEL(16/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o.yml 
DEL(17/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U 
DEL(18/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U.yml 
DEL(19/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v 
DEL(20/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v.yml 

  CPAN.pm: Building C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz

Install scripts? y/n [n ]
n 
Do you want to run tests that require connection to servers across the internet
(likely to cause some failures)? y/n [n ]
n 
  - will not run internet-requiring tests
Created MYMETA.yml and MYMETA.json
Creating new 'Build' script for 'BioPerl-Run' version '1.006900'
Building BioPerl-Run
  CJFIELDS/BioPerl-Run-1.006900.tar.gz
  ./Build -- OK
Running Build test
t/Amap.t ...................... 1/18 # Required executable for Bio::Tools::Run::Alignment::Amap is not present
t/Amap.t ...................... ok
t/AnalysisFactory_soap.t ...... skipped: Network tests have not been requested
t/Analysis_soap.t ............. skipped: Network tests have not been requested
t/BEDTools.t .................. 3/423 # Required executable for Bio::Tools::Run::BEDTools is not present
t/BEDTools.t .................. ok
t/BWA.t ....................... 1/36 # Required executable for Bio::Tools::Run::BWA is not present
t/BWA.t ....................... ok
t/Blat.t ...................... 1/33 # Required executable for Bio::Tools::Run::Alignment::Blat is not present
# Looks like you planned 33 tests but ran 20.
t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 13/33 subtests
(less 15 skipped subtests: 5 okay)
t/Bowtie.t .................... 1/73 # Required executable for Bio::Tools::Run::Bowtie is not present
t/Bowtie.t .................... ok
t/Cap3.t ...................... 1/91 # Required executable for Bio::Tools::Run::Cap3 is not present
t/Cap3.t ...................... ok
t/Clustalw.t .................. 1/45 # Required executable for Bio::Tools::Run::Alignment::Clustalw is not present
t/Clustalw.t .................. ok
t/Coil.t ...................... 2/6 # Required executable for Bio::Tools::Run::Coil is not present
t/Coil.t ...................... ok
t/Consense.t .................. 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::Consense is not present
t/Consense.t .................. ok
t/DBA.t ....................... 1/18 # Required executable for Bio::Tools::Run::Alignment::DBA is not present
t/DBA.t ....................... ok
t/DrawGram.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawGram is not present
t/DrawGram.t .................. ok
t/DrawTree.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawTree is not present
t/DrawTree.t .................. ok
t/EMBOSS.t .................... ok
t/Ensembl.t ................... skipped: Network tests have not been requested
t/Eponine.t ................... 1/7 # Looks like you planned 7 tests but ran 2.
t/Eponine.t ................... Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 5/7 subtests
t/Exonerate.t ................. 1/89 # Required executable for Bio::Tools::Run::Alignment::Exonerate is not present
t/Exonerate.t ................. ok
t/FootPrinter.t ............... 1/24 # Required executable for Bio::Tools::Run::FootPrinter is not present
t/FootPrinter.t ............... ok
t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable $GENEMARK_MODELS is not set
t/Genemark.hmm.prokaryotic.t .. ok
t/Genewise.t .................. 1/20 # Required executable for Bio::Tools::Run::Genewise is not present
t/Genewise.t .................. ok
t/Genscan.t ................... 1/6 # Required environment variable $GENSCANDIR is not set
t/Genscan.t ................... ok
t/Gerp.t ...................... 1/33 # Required executable for Bio::Tools::Run::Phylo::Gerp is not present
t/Gerp.t ...................... ok
t/Glimmer2.t .................. 1/217 # Required executable for Bio::Tools::Run::Glimmer is not present
t/Glimmer2.t .................. ok
t/Glimmer3.t .................. 1/111 # Required executable for Bio::Tools::Run::Glimmer is not present
t/Glimmer3.t .................. ok
t/Gumby.t ..................... 1/124 # Required executable for Bio::Tools::Run::Phylo::Gumby is not present
t/Gumby.t ..................... ok
t/Hmmer.t ..................... 1/27 # Required executable for Bio::Tools::Run::Hmmer is not present
t/Hmmer.t ..................... ok
t/Hyphy.t ..................... 2/15 # Required executable for Bio::Tools::Run::Phylo::Hyphy::SLAC is not present
t/Hyphy.t ..................... ok
t/Infernal.t .................. 1/43 # Required executable for Bio::Tools::Run::Infernal is not present
t/Infernal.t .................. ok
t/Kalign.t .................... 1/8 # Required executable for Bio::Tools::Run::Alignment::Kalign is not present
t/Kalign.t .................... ok
t/LVB.t ....................... 1/19 # Required executable for Bio::Tools::Run::Phylo::LVB is not present
t/LVB.t ....................... ok
t/Lagan.t ..................... 1/12 # Required executable for Bio::Tools::Run::Alignment::Lagan is not present
t/Lagan.t ..................... ok
t/MAFFT.t ..................... 1/17 # Required executable for Bio::Tools::Run::Alignment::MAFFT is not present
t/MAFFT.t ..................... ok
t/MCS.t ....................... 1/24 # Required executable for Bio::Tools::Run::MCS is not present
t/MCS.t ....................... ok
t/Maq.t ....................... 1/51 # Required executable for Bio::Tools::Run::Maq is not present
t/Maq.t ....................... ok
t/Match.t ..................... 1/7 # Required executable for Bio::Tools::Run::Match is not present
t/Match.t ..................... ok
t/Mdust.t ..................... 1/5 # Required executable for Bio::Tools::Run::Mdust is not present
t/Mdust.t ..................... ok
t/Meme.t ...................... 1/25 # Required executable for Bio::Tools::Run::Meme is not present
t/Meme.t ...................... ok
t/Minimo.t .................... 1/72 # Required executable for Bio::Tools::Run::Minimo is not present
t/Minimo.t .................... ok
t/Molphy.t .................... 1/10 # Required executable for Bio::Tools::Run::Phylo::Molphy::ProtML is not present
t/Molphy.t .................... ok
t/Muscle.t .................... 1/16 # Required executable for Bio::Tools::Run::Alignment::Muscle is not present
t/Muscle.t .................... ok
t/Neighbor.t .................. 1/17 # Required executable for Bio::Tools::Run::Phylo::Phylip::Neighbor is not present
t/Neighbor.t .................. ok
t/Newbler.t ................... 1/98 # Required executable for Bio::Tools::Run::Newbler is not present
t/Newbler.t ................... ok
t/Njtree.t .................... 1/6 # Required executable for Bio::Tools::Run::Phylo::Njtree::Best is not present
t/Njtree.t .................... ok
t/PAML.t ...................... 1/28 # Required executable for Bio::Tools::Run::Phylo::PAML::Codeml is not present
t/PAML.t ...................... ok
t/Pal2Nal.t ................... 1/9 # Required executable for Bio::Tools::Run::Alignment::Pal2Nal is not present
t/Pal2Nal.t ................... ok
t/PhastCons.t ................. 1/181 # Required executable for Bio::Tools::Run::Phylo::Phast::PhastCons is not present
t/PhastCons.t ................. ok
t/Phrap.t ..................... 1/127 # Required executable for Bio::Tools::Run::Phrap is not present
t/Phrap.t ..................... ok
t/Phyml.t ..................... 1/47 # Required executable for Bio::Tools::Run::Phylo::Phyml is not present
t/Phyml.t ..................... ok
t/Primate.t ................... 1/8 # Required executable for Bio::Tools::Run::Primate is not present
t/Primate.t ................... ok
t/Primer3.t ................... 1/9 # Required executable for Bio::Tools::Run::Primer3 is not present
t/Primer3.t ................... ok
t/Prints.t .................... 1/7 # Required executable for Bio::Tools::Run::Prints is not present
t/Prints.t .................... ok
t/Probalign.t ................. 1/13 # Required executable for Bio::Tools::Run::Alignment::Probalign is not present
t/Probalign.t ................. ok
t/Probcons.t .................. 1/11 # Required executable for Bio::Tools::Run::Alignment::Probcons is not present
t/Probcons.t .................. ok
t/Profile.t ................... 1/7 # Required executable for Bio::Tools::Run::Profile is not present
t/Profile.t ................... ok
t/Promoterwise.t .............. 1/9 # Required executable for Bio::Tools::Run::Promoterwise is not present
t/Promoterwise.t .............. ok
t/ProtDist.t .................. 1/14 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtDist is not present
t/ProtDist.t .................. ok
t/ProtPars.t .................. 1/11 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtPars is not present
t/ProtPars.t .................. ok
t/Pseudowise.t ................ 1/18 # Required executable for Bio::Tools::Run::Pseudowise is not present
t/Pseudowise.t ................ ok
t/QuickTree.t ................. 1/13 # Required executable for Bio::Tools::Run::Phylo::QuickTree is not present
t/QuickTree.t ................. ok
t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as  or not executable.
# Required executable for Bio::Tools::Run::RepeatMasker is not present
t/RepeatMasker.t .............. ok
t/SABlastPlus.t ............... 1/65 # Required executable for Bio::Tools::Run::BlastPlus is not present
# Looks like you planned 65 tests but ran 63.
t/SABlastPlus.t ............... Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 2/65 subtests
(less 59 skipped subtests: 4 okay)
t/SLR.t ....................... 1/7 # Required executable for Bio::Tools::Run::Phylo::SLR is not present
t/SLR.t ....................... ok
t/Samtools.t .................. ok
t/Seg.t ....................... 1/8 # Required executable for Bio::Tools::Run::Seg is not present
t/Seg.t ....................... ok
t/Semphy.t .................... 1/19 # Required executable for Bio::Tools::Run::Phylo::Semphy is not present
t/Semphy.t .................... ok
t/SeqBoot.t ................... 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present
t/SeqBoot.t ................... ok
t/Signalp.t ................... 1/7 # Required executable for Bio::Tools::Run::Signalp is not present
t/Signalp.t ................... ok
t/Sim4.t ...................... 1/23 # Required executable for Bio::Tools::Run::Alignment::Sim4 is not present
t/Sim4.t ...................... ok
t/Simprot.t ................... 1/6 # Required executable for Bio::Tools::Run::Simprot is not present
t/Simprot.t ................... ok
t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed
t/SoapEU-unit.t ............... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed
t/StandAloneFasta.t ........... 1/15 # Required executable for Bio::Tools::Run::Alignment::StandAloneFasta is not present
t/StandAloneFasta.t ........... ok
t/TCoffee.t ................... 1/27 # Required executable for Bio::Tools::Run::Alignment::TCoffee is not present
t/TCoffee.t ................... ok
t/TigrAssembler.t ............. 1/88 # Required executable for Bio::Tools::Run::TigrAssembler is not present
# Required executable for Bio::Tools::Run::TigrAssembler is not present
t/TigrAssembler.t ............. ok
t/Tmhmm.t ..................... 1/9 # Required executable for Bio::Tools::Run::Tmhmm is not present
t/Tmhmm.t ..................... ok
t/TribeMCL.t .................. ok
t/Vista.t ..................... ok
t/gmap-run.t .................. 1/8 # Required executable for Bio::Tools::Run::Alignment::Gmap is not present
t/gmap-run.t .................. ok
t/tRNAscanSE.t ................ 1/12 # Required executable for Bio::Tools::Run::tRNAscanSE is not present
t/tRNAscanSE.t ................ ok

Test Summary Report
-------------------
t/Blat.t                    (Wstat: 65280 Tests: 20 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 33 tests but ran 20.
t/Eponine.t                 (Wstat: 65280 Tests: 2 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 7 tests but ran 2.
t/SABlastPlus.t             (Wstat: 65280 Tests: 63 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 65 tests but ran 63.
Files=80, Tests=2876, 39 wallclock secs ( 0.54 usr  0.23 sys + 32.54 cusr  4.94 csys = 38.25 CPU)
Result: FAIL
Failed 3/80 test programs. 0/2876 subtests failed.
  CJFIELDS/BioPerl-Run-1.006900.tar.gz
  ./Build test -- NOT OK
//hint// to see the cpan-testers results for installing this module, try:
  reports CJFIELDS/BioPerl-Run-1.006900.tar.gz
Running Build install
  make test had returned bad status, won't install without force
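
Reading the summary above: 0 of 2876 subtests actually failed. The three failing scripts all stopped with "Bad plan" (fewer tests ran than were planned), and t/Blat.t and t/SABlastPlus.t also report that their external executable is not present. One plausible reading, not confirmed anywhere in this thread, is that the BLAST+ programs themselves are not installed or not on PATH. A minimal sketch for checking that from the shell follows; missing_tools is a made-up helper name, and blastn, blastp and makeblastdb are the usual BLAST+ binary names, though which of them the test script looks for is an assumption.

```shell
# missing_tools NAME...: print every NAME that cannot be found on PATH.
# Hypothetical helper; prints nothing when all tools are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed BLAST+ binary names; adjust to whatever your installation provides.
missing=$(missing_tools blastn blastp makeblastdb)

if [ -z "$missing" ]; then
    echo "all BLAST+ programs found on PATH"
else
    echo "not found on PATH:"
    echo "$missing"
fi
```

If any of them are reported as not found, installing NCBI BLAST+ and making sure its bin directory is on PATH is probably the first thing to try before re-running the install.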


From guy.leonard at gmail.com  Wed Feb  6 13:35:38 2013
From: guy.leonard at gmail.com (guy.leonard at gmail.com)
Date: Wed, 6 Feb 2013 10:35:38 -0800 (PST)
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>

Nice, super work. 

Will there be a rough list of feature changes/addition/deprecation, or 
shall I consult git logs?

On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote:
>
> All, 
>
> I am scheduling the next BioPerl CPAN release tentatively for March 1. 
>  Any help in triaging bug reports would be greatly appreciated!   
>
> Amongst all other changes, as mentioned in a separate thread we will 
> remove Bio::FeatureIO, now developed in a separate repository: 
>
>     https://github.com/bioperl/Bio-FeatureIO 
>
> Feedback, suggestions, etc are greatly appreciated. 
>
> chris 
> _______________________________________________ 
> Bioperl-l mailing list 
> Biop... at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l 
>


From sidd.basu at gmail.com  Wed Feb  6 14:36:17 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 6 Feb 2013 13:36:17 -0600
Subject: [Bioperl-l]  Re: Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com>

Hi, 

On Tue, 05 Feb 2013, Fields, Christopher J wrote:

> All,
> 
> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
> 
> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
> 
>     https://github.com/bioperl/Bio-FeatureIO
> 
> Feedback, suggestions, etc are greatly appreciated.

Here are the CI build reports on 5.12, 5.14 and 5.16 using Travis.
https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true

I could not get 5.10 to work on Travis. Though I activated the --network
option, it still didn't run one of the tests that needs the network. Also,
I was initially confused by the fact that, though it has a dist.ini, the
tests still have to run through Build.PL. Running **dzil test** does not work.

Hope this helps.

thanks, 
-siddhartha

> 
> chris


From cjfields at illinois.edu  Wed Feb  6 14:46:49 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 19:46:49 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A109@CHIMBX5.ad.uillinois.edu>

We've been a little better at keeping track of significant changes this time 'round.  There aren't a lot of major updates, but it's important to make sure we get a release out to ensure everyone (not just those familiar with git) can access them.

chris

On Feb 6, 2013, at 12:35 PM, <guy.leonard at gmail.com>
 wrote:

> Nice, super work. 
> 
> Will there be a rough list of feature changes/addition/deprecation, or 
> shall I consult git logs?
> 
> On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote:
>> 
>> All, 
>> 
>> I am scheduling the next BioPerl CPAN release tentatively for March 1. 
>> Any help in triaging bug reports would be greatly appreciated!   
>> 
>> Amongst all other changes, as mentioned in a separate thread we will 
>> remove Bio::FeatureIO, now developed in a separate repository: 
>> 
>>    https://github.com/bioperl/Bio-FeatureIO 
>> 
>> Feedback, suggestions, etc are greatly appreciated. 
>> 
>> chris 


From cjfields at illinois.edu  Wed Feb  6 14:54:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 19:54:58 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>

On Feb 6, 2013, at 1:36 PM, Siddhartha Basu <sidd.basu at gmail.com>
 wrote:

> Hi, 
> 
> On Tue, 05 Feb 2013, Fields, Christopher J wrote:
> 
>> All,
>> 
>> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
>> 
>> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
>> 
>>    https://github.com/bioperl/Bio-FeatureIO
>> 
>> Feedback, suggestions, etc are greatly appreciated.
> 
> Here are the CI build reports on 5.12, 5.14 and 5.16 using Travis.
> https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
> https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
> https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true
> 
> I could not get 5.10 to work on Travis. Though I activated the --network
> option, it still didn't run one of the tests that needs the network. Also,
> I was initially confused by the fact that, though it has a dist.ini, the
> tests still have to run through Build.PL. Running **dzil test** does not work.
> 
> Hope this helps.
> 
> thanks, 
> -siddhartha

Just to point out, that was for Bio-FeatureIO.  Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release).  

Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken).  I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed.  Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation.

chris


From sidd.basu at gmail.com  Wed Feb  6 15:26:06 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 6 Feb 2013 14:26:06 -0600
Subject: [Bioperl-l]  Re: Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
Message-ID: <5112bc60.c69e320a.1e98.2028@mx.google.com>

On Wed, 06 Feb 2013, Fields, Christopher J wrote:

> On Feb 6, 2013, at 1:36 PM, Siddhartha Basu <sidd.basu at gmail.com>
>  wrote:
> 
> > Hi, 
> > 
> > On Tue, 05 Feb 2013, Fields, Christopher J wrote:
> > 
> >> All,
> >> 
> >> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
> >> 
> >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
> >> 
> >>    https://github.com/bioperl/Bio-FeatureIO
> >> 
> >> Feedback, suggestions, etc are greatly appreciated.
> > 
> > Here are the CI build reports on 5.12, 5.14 and 5.16 using Travis.
> > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
> > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
> > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true
> > 
> > I could not get 5.10 to work on Travis. Though I activated the --network
> > option, it still didn't run one of the tests that needs the network. Also,
> > I was initially confused by the fact that, though it has a dist.ini, the
> > tests still have to run through Build.PL. Running **dzil test** does not work.
> > 
> > Hope this helps.
> > 
> > thanks, 
> > -siddhartha
> 
> Just to point out, that was for Bio-FeatureIO.  Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release).  
So, what steps are left for getting the release out to CPAN? For example,
are there a lot of feature branches still left to be merged, or a lot
of unit tests still not passing? I am just trying to figure out whether I
could be of any help in expediting the release process. However, if these
are already taken care of, please ignore this.

> 
> Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken).  I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed.  
As for the error I encountered, the presence of Build.PL was blocking the
dzil build/release process. By default, dzil expects to generate
Build.PL during its build/release process. However, I am not sure which
mode is the most suitable for BioPerl devs.
> Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation.

thanks, 
-siddhartha

> 
> chris


From hlapp at drycafe.net  Wed Feb  6 16:30:33 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 6 Feb 2013 16:30:33 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
Message-ID: <A78F0D43-8296-45CF-9409-320D1FE7CA2F@drycafe.net>

Great points, George, and you're making a very compelling argument. I'm in total agreement. It's almost becoming a reason to be embarrassed about still programming in Perl these days, so one might as well have fun while it lasts.

	-hilmar

On Feb 6, 2013, at 12:58 PM, George Hartzell wrote:

> Fields, Christopher J writes:
>> [...]
>> Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>> out that Python users are in the same boat: the Python version for
>> CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>> (and recommends python 2.7).   
>> 
>> We can always state that perl 5.8 is supported for the upcoming
>> Bioperl release, but we're dropping v5.8 support for any future
>> releases. 
> 
> Do more than drop support for 5.8.
> 
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
> 
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
> 
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
> 
> Look forward.  Keep it interesting and fun.
> 
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
> 
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that make it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
> 
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date?
> 
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
> 
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
> 
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
> 
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
> 
> g.

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From cjfields at illinois.edu  Wed Feb  6 17:11:06 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:11:06 +0000
Subject: [Bioperl-l] BioPerl long-term, was Re:  dependencies on perl version
In-Reply-To: <CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>

George,

Should put your post on a pedestal :)

tl;dr version: I completely agree, but we need help in order to do this.

Long(-winded) version:

I agree completely, backwards compatibility is killing us.  But, we do need current and new people to get involved and help drive this forward.  We need people on all fronts, from coding and bug fixes to documentation and web site maintenance.  I've been driving this bus for a number of years now.  Not getting tired yet, but I am getting substantially busier with my current endeavors, so my time spent working on BioPerl has dwindled considerably.  Any additional support or sharing of responsibilities will help tremendously in keeping up momentum (if someone else wants to take the wheel for a bit, please let me know :).  

If we follow the perl release route, we should streamline the release process (think Dist::Zilla), end support of older versions of Perl, and work on a sustainable release schedule.  The fact that we have so many of us so-called 'old folks' speaking up in favor of this is a very good sign.  We do need a bit more than that; we need help.  BioPerl is a very large project.

There is a key point we need to address, one that is very important for the future of BioPerl.  I use Perl quite a bit in my current work (and dabble with Ruby and Python as well when I have to).  BioPerl?  A little, but not as much as I could.

Shocked?  The three main reasons I don't use it 'in anger':  performance, performance, and performance.  It is very important that we make a concerted effort to address this at all levels.  It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them).

A specific example: Heng Li once tested the performance of FASTQ parsing (perl, python, bioperl, biopython, his C code, etc). BioPerl's FASTQ couldn't even be measured; IIRC it went on for many hours until he killed it.  This was with the older version of the parser, but I'm willing to bet the newer one I wrote isn't any better.

This. needs. to. change.

I see no problem in stating any generic parsing and low-level interfaces are just as much a part of what BioPerl encompasses as the higher-level Bio::* classes themselves.  Steve and Jason were on to something with SearchIO; it's maybe not as performant as we would like, but it certainly is more flexible in terms of what can be done, b/c it separates out low-level parsing from object creation.  That's the general model we should look at.  There is a good reason Biopython is following this model with their SearchIO implementation (Peter C, are you reading this?)
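
To make the idea of separating low-level parsing from object creation a bit more concrete, here is a minimal sketch.  It is written in Python only for brevity, and it is not how SearchIO (in BioPerl or Biopython) is actually implemented.  The three-column tab-separated hit format and the names parse_hits, Hit and ResultBuilder are all made up for illustration.

```python
import io
from typing import Callable, Dict, Iterable, List

Event = Dict[str, str]


def parse_hits(lines: Iterable[str], emit: Callable[[Event], None]) -> None:
    """Low-level pass: split each line and hand a plain dict to the callback.

    Assumed (hypothetical) format: query<TAB>hit<TAB>identity, '#' comments.
    No objects are built here; that is left to whoever receives the events.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        query, hit, identity = line.split("\t")
        emit({"query": query, "hit": hit, "identity": identity})


class Hit:
    """Heavier object, only created when the caller asks for objects."""

    def __init__(self, name: str, identity: str) -> None:
        self.name = name
        self.identity = float(identity)


class ResultBuilder:
    """Listener that turns raw events into Hit objects grouped by query."""

    def __init__(self) -> None:
        self.results: Dict[str, List[Hit]] = {}

    def __call__(self, event: Event) -> None:
        hit = Hit(event["hit"], event["identity"])
        self.results.setdefault(event["query"], []).append(hit)


if __name__ == "__main__":
    text = "# query\thit\tidentity\nq1\thA\t98.5\nq1\thB\t77.0\n"

    # Fast path: collect the raw dicts, no object creation at all.
    raw: List[Event] = []
    parse_hits(io.StringIO(text), raw.append)

    # Convenient path: same parser, objects built by the listener.
    builder = ResultBuilder()
    parse_hits(io.StringIO(text), builder)
    print(len(raw), [h.name for h in builder.results["q1"]])
```

A caller who only wants the raw fields passes a list's append method as the callback and skips object creation entirely; a caller who wants objects passes a ResultBuilder.  The parser itself does not change between the two.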

We have a lot of very talented people involved with this project, both on the purely computational and purely biological end as well as the folks like me who straddle the two domains.  There is a lot of good code out there that can be used, wrapped, and taken advantage of, including everything we currently have in BioPerl.  Let's come up with something that both works and works well, that people can use on a regular basis, even at a low level if they choose.  That alone would dissuade new users from writing up (yet another) custom FASTA/FASTQ/BLAST/GenBank/etc parser b/c the BioPerl one takes millennia to finish.

A few examples on this front: Rob Buels created a generic parser for GFF3 (Bio::GFF3::LowLevel) with very few dependencies; we wrap this with the newer Bio::FeatureIO code.  Leon has Bio::SFF.  Lincoln of course wrote Bio::DB::Sam and Bio::DB::BigFile.  I have started a wrapper around Heng's FASTQ/FASTA parsing code (kseq); it seems to work quite well (~20M FASTQ in 30 sec last I recall?).

So:

If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.

If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.

If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.

If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).

If it means this is to be BioPerl 2.0, then let's move in that direction, sooner rather than later.

But I can't do it alone.  We (not just me, but we) need to drive the direction we take.

First one who codes gets the gold ring.

chris

On Feb 6, 2013, at 12:47 PM, Aaron Mackey <amackey at virginia.edu>
 wrote:

> Huzzah!
> 
> --
> Aaron J. Mackey, PhD
> Assistant Professor
> Center for Public Health Genomics
> University of Virginia
> amackey at virginia.edu
> http://www.cphg.virginia.edu/mackey
> 
> 
> On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell <hartzell at alerce.com> wrote:
> Fields, Christopher J writes:
>  > [...]
>  > Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>  > out that Python users are in the same boat: the Python version for
>  > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>  > (and recommends python 2.7).
>  >
>  > We can always state that perl 5.8 is supported for the upcoming
>  > Bioperl release, but we're dropping v5.8 support for any future
>  > releases.
> 
> Do more than drop support for 5.8.
> 
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
> 
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
> 
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
> 
> Look forward.  Keep it interesting and fun.
> 
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
> 
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that make it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
> 
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date?
> 
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
> 
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
> 
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
> 
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
> 
> g.


From cjfields at illinois.edu  Wed Feb  6 17:34:42 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:34:42 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re:  dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1AF0C@CHIMBX5.ad.uillinois.edu>

I want to clarify: parser optimization isn't the only point we need to focus on by any means (and may not be the main one).  There is a lot of room for improvement top to bottom; that was one specific example that I have long held to be an issue.

-c

On Feb 6, 2013, at 4:11 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:

> Shocked?  The three main reasons I don't use it 'in anger':  performance, performance, and performance.  It is very important that we make a concerted effort to address this at all levels.  It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them).
...


From p.j.a.cock at googlemail.com  Wed Feb  6 17:43:13 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 6 Feb 2013 22:43:13 +0000
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>

On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
>
> I see no problem in stating any generic parsing and low-level interfaces
> are just as much a part of what BioPerl encompasses as the higher-level
> Bio::* classes themselves.  Steve and Jason were on to something with
> SearchIO; it's maybe not as performant as we would like, but it certainly
> is more flexible in terms of what can be done, b/c it separates out
> low-level parsing from object creation.  That's the general model we
> should look at.  There is a good reason Biopython is following this
> model with their SearchIO implementation (Peter C, are you reading this?)

Actually I don't think we did end up with that kind of separation in the
Biopython SearchIO - which is not to say it isn't an excellent model
to follow. Rather the Biopython SearchIO (like the BioPerl one) had
as the first goal a consistent object model across assorted file
formats.

The idea of low-level, minimal-overhead parsers (which are very
format specific), on which a heavier but consistent object model
can be built, might be a good balance - the high-level API has the
convenience, but if you give that up you can have more speed.
That's what I recommend with FASTQ and Biopython, e.g.
http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
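
As a rough illustration of that layering (a sketch only, not Biopython's actual code): a plain generator yields string tuples, and an optional wrapper builds heavier records on top of it.  It assumes unwrapped four-line FASTQ records and a Sanger quality offset of 33; read_fastq, read_objects and Read are hypothetical names.

```python
import io
from typing import Iterator, NamedTuple, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Low-level layer: yield (title, sequence, quality) as plain strings.

    Assumes each record is exactly four lines (no line-wrapped sequences).
    """
    while True:
        header = handle.readline()
        if not header:
            return  # clean end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near %r" % header)
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in %r" % header)
        yield header[1:].rstrip("\n"), seq, qual


class Read(NamedTuple):
    """High-level layer: a friendlier record with decoded quality scores."""

    id: str
    seq: str
    phred: Tuple[int, ...]


def read_objects(handle: TextIO, offset: int = 33) -> Iterator[Read]:
    """Build Read records on top of the tuple parser (offset 33 assumed)."""
    for title, seq, qual in read_fastq(handle):
        read_id = title.partition(" ")[0]
        yield Read(read_id, seq, tuple(ord(c) - offset for c in qual))


if __name__ == "__main__":
    data = "@r1 first read\nACGT\n+\nIIII\n"
    for title, seq, qual in read_fastq(io.StringIO(data)):
        print(title, len(seq))
    for read in read_objects(io.StringIO(data)):
        print(read.id, read.phred)
```

Code that only filters or counts reads can stay on read_fastq and never pay for quality decoding or record construction; code that wants the convenience uses read_objects.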

>
> I have started a wrapper around Heng's FASTQ/FASTA parsing
> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> last I recall?).
>

I'd have to dig through my emails, but I think the BioRuby guys
looked at that too - as I recall while it was fast, the error handling
left something to be desired. Email me directly or on the BioRuby
list if you want to follow up on that.

Regards,

Peter


From cjfields at illinois.edu  Wed Feb  6 17:53:21 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:53:21 +0000
Subject: [Bioperl-l] FASTQ, was Re:  BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>

On Feb 6, 2013, at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> 
>> I see no problem in stating any generic parsing and low-level interfaces
>> are just as much a part of what BioPerl encompasses as the higher-level
>> Bio::* classes themselves.  Steve and Jason were on to something with
>> SearchIO; it's maybe not as performant as we would like, but it certainly
>> is more flexible in terms of what can be done, b/c it separates out
>> low-level parsing from object creation.  That's the general model we
>> should look at.  There is a good reason Biopython is following this
>> model with their SearchIO implementation (Peter C, are you reading this?)
> 
> Actually I don't think we did end up with that kind of separation in the
> Biopython SearchIO - which is not to say it isn't an excellent model
> to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> as the first goal a consistent object model across assorted file
> formats.
> 
> The idea of low-level, minimal-overhead parsers (which are very
> format specific), on which a heavier but consistent object model
> can be built, might be a good balance - the high-level API has the
> convenience, but if you give that up you can have more speed.
> That's what I recommend with FASTQ and Biopython, e.g.
> http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> 
>> 
>> I have started a wrapper around Heng's FASTQ/FASTA parsing
>> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
>> last I recall?).
>> 
> 
> I'd have to dig through my emails, but I think the BioRuby guys
> looked at that too - as I recall while it was fast, the error handling
> left something to be desired. Email me directly or on the BioRuby
> list if you want to follow up on that.
> 
> Regards,
> 
> Peter

I did a little on this, and it's worth following up on.  I pulled the FASTQ test examples you created from the paper to test it out.  IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into.  Maybe worth moving to open-bio-l for broader discussion.

chris


From whereverroadgoes at gmail.com  Wed Feb  6 16:59:04 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Wed, 6 Feb 2013 13:59:04 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <87txpr26jj.fsf@topper.koldfront.dk>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
	<87txpr26jj.fsf@topper.koldfront.dk>
Message-ID: <411e920d-e614-417d-9198-78bef9adba16@googlegroups.com>

Everything's working now! Thank you very much, especially to you Adam!



From carandraug+dev at gmail.com  Wed Feb  6 20:38:20 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Thu, 7 Feb 2013 01:38:20 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_0esYVUe_0gZHdAtk4orJQMO82fLjnfNL3Nap=BqX7RWw@mail.gmail.com>

On 5 February 2013 20:56, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 5, 2013, at 2:06 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>> how much perl backwards compatibility does bioperl need to keep?
>
> Aim for 5.10.1, but be careful of smart-match.

Well, I solved my problem differently and ended up not needing any of
the new features. But next time I'll know. Thanks

Carnë


From pcantalupo at gmail.com  Wed Feb  6 23:04:08 2013
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Wed, 6 Feb 2013 23:04:08 -0500
Subject: [Bioperl-l] bug 3376 status needs updated
Message-ID: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>

Hi,

A few months ago, I fixed bug 3376 (
https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
updated to resolved or closed. Should I do this or is Chris the only one
who does that?

Thank you,

Paul


From cjfields at illinois.edu  Wed Feb  6 23:20:30 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 04:20:30 +0000
Subject: [Bioperl-l] bug 3376 status needs updated
In-Reply-To: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>
References: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B45C@CHIMBX5.ad.uillinois.edu>

No, go ahead and close it.  Let me know if you run into perm. problems with it.

chris

On Feb 6, 2013, at 10:04 PM, Paul Cantalupo <pcantalupo at gmail.com>
 wrote:

> Hi,
> 
> A few months ago, I fixed bug 3376 (
> https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
> The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
> updated to resolved or closed. Should I do this or is Chris the only one
> who does that?
> 
> Thank you,
> 
> Paul
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From l.m.timmermans at students.uu.nl  Thu Feb  7 04:07:57 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Thu, 7 Feb 2013 10:07:57 +0100
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <5112bc60.c69e320a.1e98.2028@mx.google.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
	<5112bc60.c69e320a.1e98.2028@mx.google.com>
Message-ID: <CAC1jpXDQG8NwaPKd8PEVqWs7NWHHAkrGaasCeJ+bKVy1z0he1Q@mail.gmail.com>

On Wed, Feb 6, 2013 at 9:26 PM, Siddhartha Basu <sidd.basu at gmail.com> wrote:
> As for the error I encountered, the presence of Build.PL was blocking the
> dzil build/release process. By default, dzil expects to generate
> Build.PL during its build/release process. However, I am not sure which
> mode is the most suitable for bioperl devs.

You can prune the Build.PL, and then let dzil add its own. We wouldn't
be the first to do that sort of thing.

Leon


From amackey at virginia.edu  Thu Feb  7 10:25:07 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 7 Feb 2013 10:25:07 -0500
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>

You might also want to consider a lazy/pull-based parser to defer
parsing/object-building for pieces of the object that don't get used.  This
also usually provides some error tolerance.

-Aaron

--
Aaron J. Mackey, PhD
Assistant Professor
Center for Public Health Genomics
University of Virginia
amackey at virginia.edu
http://www.cphg.virginia.edu/mackey


On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J <cjfields at illinois.edu> wrote:

> I did a little on this, worth following up on, but I pulled the FASTQ test
> examples you created from the paper to test it out.  IIRC it parsed where
> it needed to, but I'm not sure how it handled bad sequences, so yes, worth
> looking into.  Maybe worth moving to open-bio-l for broader discussion.
>
> chris


From tiago.hori at gmail.com  Thu Feb  7 09:58:37 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Thu, 7 Feb 2013 06:58:37 -0800 (PST)
Subject: [Bioperl-l] Search I::O
In-Reply-To: <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
References: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>
	<6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
Message-ID: <e5d61704-086a-4434-ae80-434252d1f55e@googlegroups.com>

Thanks, Jason! It is working now.

So here is what I am trying to accomplish. For a given BLASTX report, I
want to extract the best BLASTX hit that is human and whose description
does not contain "unnamed" or "PREDICTED". I got very close, but I still
can't get it to give me only the top BLAST hit; it gives me all BLAST hits
that meet my criteria. I tried using "last" to stop it from looping through
the hits once it found a human one, but it didn't work. Can someone help?
Here is my code so far (mostly taken from the wiki).

use strict;
use warnings;
use Bio::SearchIO;

my $in = Bio::SearchIO->new(-format => 'blast',
                            -file   => 'testsalmon.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    if( $hit->description !~ /[Uu]nnamed|PREDICTED|hypothetical/ ) {
      if( $hit->description =~ /Homo sapiens/ ) {
        while( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
            if( $hsp->percent_identity >= 30 ) {
              if( $hsp->evalue <= 1e-05 ) {
                ## one tab-separated line per qualifying HSP
                print "Query=",        $result->query_name,    "\t",
                      " Description=", $hit->description,      "\t",
                      " Hit=",         $hit->name,             "\t",
                      " Length=",      $hsp->length('total'),  "\t",
                      " Percent_id=",  $hsp->percent_identity, "\n";
              }
            }
          }
        }
      }
    }
  }
}
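
For what it's worth, one possible way to stop at the first qualifying hit
is sketched below. A bare "last" inside the HSP loop only leaves that inner
loop, so the hit loop carries a label and the code says "last HIT". This
assumes the hits come back in report order (best first). The helper name
first_matching_hit is made up for illustration and is not a BioPerl
function; it only calls the next_hit/next_hsp accessors already used in the
script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the first (hit, hsp)
# pair in a result that passes every filter, or an empty list if none does.
sub first_matching_hit {
    my ($result) = @_;
    my @found;
  HIT:
    while ( my $hit = $result->next_hit ) {
        my $desc = $hit->description;
        next HIT if $desc =~ /[Uu]nnamed|PREDICTED|hypothetical/;
        next HIT if $desc !~ /Homo sapiens/;
        while ( my $hsp = $hit->next_hsp ) {
            next if $hsp->length('total') <= 50;
            next if $hsp->percent_identity < 30;
            next if $hsp->evalue > 1e-05;
            @found = ( $hit, $hsp );
            last HIT;    # leaves the HSP loop and the hit loop together
        }
    }
    return @found;
}

# Possible use with the $in reader from the script above:
# while ( my $result = $in->next_result ) {
#     my ( $hit, $hsp ) = first_matching_hit($result) or next;
#     print join( "\t", $result->query_name, $hit->name, $hit->description,
#                 $hsp->length('total'), $hsp->percent_identity ), "\n";
# }
```

Flattening the nested ifs into "next" guards is only a style choice; the
thresholds are the same ones as in the script.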


T.


On Wednesday, February 6, 2013 6:46:47 PM UTC-3:30, Jason Stajich wrote:
>
> you are missing a comma after the -format => 'blast' 
> should be 
> my $in = Bio::SearchIO->new(-format => 'blast',   
>   -file => 'XXX' ); 
>
>
> On Feb 5, 2013, at 7:21 AM, Tiago Hori <tiago... at gmail.com> wrote: 
>
> > Hi All, 
> > 
> > I am trying to find the best putative orthologs for 44K Atlantic Salmon 
> > sequences, and so I need to parse 44K BLAST reports to find the best 
> > human hit. I am trying to learn SearchIO, but when I try the first 
> > example on the HOWTO: 
> > 
> > use strict; 
> > use Bio::SearchIO; 
> > 
> > my $in = new Bio::SearchIO(-format => 'blast' 
> >               -file => 'C001R047.txt'); 
> > 
> > while( my $result = $in->next_result ) { 
> >  ## $result is a Bio::Search::Result::ResultI compliant object 
> >  while( my $hit = $result->next_hit ) { 
> >    ## $hit is a Bio::Search::Hit::HitI compliant object 
> >    while( my $hsp = $hit->next_hsp ) { 
> >      ## $hsp is a Bio::Search::HSP::HSPI compliant object 
> >      if( $hsp->length('total') > 50 ) { 
> >        if ( $hsp->percent_identity >= 75 ) { 
> >          print "Query=",   $result->query_name, 
> >            " Hit=",        $hit->name, 
> >            " Length=",     $hsp->length('total'), 
> >            " Percent_id=", $hsp->percent_identity, "\n"; 
> >        } 
> >      } 
> >    }   
> >  } 
> > } 
> > 
> > I get this error: Odd number of elements in hash assignment at 
> > /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. 
> > 
> > I am using BioPerl version 1.6.901. Is there a format problem with the 
> > blast reports? 
> > 
> > Any help would be greatly appreciated! 
> > 
> > T. 
>
> Jason Stajich 
> jason.... at gmail.com 
> ja... at bioperl.org 
>
>


From cjfields at illinois.edu  Thu Feb  7 10:56:04 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 15:56:04 +0000
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>

This will likely be the approach for a more NGS-friendly Bio::Seq class.  Calculation of the PHRED scores could also be deferred until needed.
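
A rough sketch of what deferring the PHRED calculation could look like, assuming Sanger-style FASTQ where each quality character encodes score + 33.  The package name LazyQual and its methods are invented for illustration; this is not an existing Bio::Seq API, just the caching idea in a few lines.

```perl
use strict;
use warnings;

package LazyQual;    # hypothetical class, for illustration only

sub new {
    my ( $class, %args ) = @_;
    # Keep the raw quality string as read; decode nothing yet.
    # The default offset of 33 is an assumption (Sanger encoding).
    return bless { raw => $args{raw}, offset => $args{offset} // 33 }, $class;
}

# Access to the raw string never triggers any decoding.
sub raw { $_[0]{raw} }

# Decode on the first call only, then hand back the cached array ref.
sub scores {
    my ($self) = @_;
    $self->{scores} //=
        [ map { ord($_) - $self->{offset} } split //, $self->{raw} ];
    return $self->{scores};
}

package main;

my $q = LazyQual->new( raw => 'II5+!' );
print join( ' ', @{ $q->scores } ), "\n";    # prints "40 40 20 10 0"
```

Code that only filters on IDs or sequence length would never pay for the split/ord pass.  The "//" operator needs Perl 5.10 or later.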

seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it.

chris

On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:

> You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used.  This also usually provides some error tolerance.
> 
> -Aaron
> 
> --
> Aaron J. Mackey, PhD
> Assistant Professor
> Center for Public Health Genomics
> University of Virginia
> amackey at virginia.edu
> http://www.cphg.virginia.edu/mackey


From amackey at virginia.edu  Thu Feb  7 11:09:14 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 7 Feb 2013 11:09:14 -0500
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>

e.g., a pull-based FASTQ parser that did nothing else at the top level but
"chunk" the file into as-yet-unparsed four-line blobs could appear to work
very fast, if the user code did nothing but count the number of entries:

  while (my $seq = $seqio->nextseq) { $ct++ };

in other words, you defer *everything* except the minimal amount of
parsing/logic required to detect object boundaries.

This is, in fact, the exact opposite of the event-based SearchIO "push"
parsers, which always perform the most parsing possible, despite the user
never accessing most of the material.
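
To make that concrete, here is a minimal sketch of such a pull-based
reader.  It assumes strictly four-line FASTQ records (no wrapped sequence
lines), and every name in it (LazyFastq, next_record, id, seq_string) is
made up for illustration rather than taken from BioPerl; in Bio::SeqIO
itself the iterator method is spelled next_seq.

```perl
use strict;
use warnings;

package LazyFastq;    # hypothetical reader, not part of BioPerl

sub new {
    my ( $class, $fh ) = @_;
    return bless { fh => $fh }, $class;
}

# Pull one record: read four lines and keep them as an unparsed blob.
sub next_record {
    my ($self) = @_;
    my $fh = $self->{fh};
    my @lines;
    for ( 1 .. 4 ) {
        my $line = <$fh>;
        last unless defined $line;
        push @lines, $line;
    }
    return unless @lines;                         # clean end of input
    die "truncated FASTQ record\n" if @lines < 4;
    return bless { blob => \@lines }, 'LazyFastq::Record';
}

package LazyFastq::Record;

# Fields are extracted (and validated) only when somebody asks for them.
sub id {
    my ($self) = @_;
    my ($id) = $self->{blob}[0] =~ /^\@(\S+)/
        or die "bad header line: $self->{blob}[0]";
    return $id;
}

sub seq_string {
    my ($self) = @_;
    ( my $seq = $self->{blob}[1] ) =~ s/\s+\z//;
    return $seq;
}

package main;

my $data = "\@r1\nACGT\n+\nIIII\n\@r2\nGGCC\n+\n!!!!\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $reader = LazyFastq->new($fh);

my $ct = 0;
$ct++ while $reader->next_record;    # counting never touches the fields
print "$ct records\n";               # prints "2 records"
```

The flip side is that a malformed header only shows up when id() is
called, which is where the error tolerance comes from, for better or
worse.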

Lastly, with respect to performance, if the parsing/object building
operation is not simply IO bound, then parallel parser/object-building CPU
threads could be considered, which could then dynamically adapt to
pre-parse attributes (e.g. quality scores) that the calling code was
actually using.  What's the state of thread-safe Perl these days?

-Aaron


On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <cjfields at illinois.edu> wrote:

> This will likely be the approach for more NGS-friendly Bio::Seq class.
>  Calculation of the PHRED scores could also be deferred until needed.
>
> seqtk has some C-based methods that we could possibly take advantage of,
> but will have to look into it.
>
> chris


From sidd.basu at gmail.com  Thu Feb  7 11:38:47 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Thu, 7 Feb 2013 10:38:47 -0600
Subject: [Bioperl-l]  Re: FASTQ, was Re:BioPerl long-term,
	was Re:	dependencies on perl version
In-Reply-To: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
References: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
Message-ID: <5113d899.ea64320a.489a.262d@mx.google.com>

Another approach might be to use MapReduce (Hadoop), if possible. I have
seen one implementation in Biopython's GFF3 parser:
http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/

-siddhartha


On Thu, 07 Feb 2013, Aaron Mackey wrote:

> e.g., a pull-based FASTQ parser that did nothing else at the top level but
> "chunk" the file into as-yet-unparsed four-line blobs could appear to work
> very fast, if the user code did nothing but count the number of entries:
> 
>   while (my $seq = $seqio->nextseq) { $ct++ };
> 
> in other words, you defer *everything* except the minimal amount of
> parsing/logic required to detect object boundaries.
> 
> This is, in fact, the exact opposite of the event-based SearchIO "push"
> parsers, which always perform the most parsing possible, despite the user
> never accessing most of the material.
> 
> Lastly, with respect to performance, if the parsing/object building
> operation is not simply IO bound, then parallel parser/object-building CPU
> threads could be considered, which could then dynamically adapt to
> pre-parse attributes (e.g. quality scores) that the calling code was
> actually using.  What's the state of thread-safe Perl these days?
> 
> -Aaron


From cjfields at illinois.edu  Thu Feb  7 11:55:53 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 16:55:53 +0000
Subject: [Bioperl-l] FASTQ, was Re:BioPerl long-term,
	was Re:	dependencies on perl version
In-Reply-To: <5113d899.ea64320a.489a.262d@mx.google.com>
References: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
	<5113d899.ea64320a.489a.262d@mx.google.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7B8@CHIMBX5.ad.uillinois.edu>

I think we will want to allow for a multitude of implementations.  SeqIO already allows for that to a degree, but multiple backend implementations (say, different ways of parsing/processing FASTQ and other formats) aren't supported yet.

chris

On Feb 7, 2013, at 10:38 AM, Siddhartha Basu <sidd.basu at gmail.com> wrote:

> Another approach might be use map-reduce(Hadoop) if possible. I have
> seen one implementation in biopython's GFF3 parser.
> http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/
> 
> -siddhartha


From cjfields at illinois.edu  Thu Feb  7 12:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 17:01:07 +0000
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7EF@CHIMBX5.ad.uillinois.edu>

re: thread-safe perl, so-so at best from what I understand.

chris

On Feb 7, 2013, at 10:09 AM, Aaron Mackey <amackey at virginia.edu> wrote:

> e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries:
> 
>   while (my $seq = $seqio->nextseq) { $ct++ };
> 
> in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries.
> 
> This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material.
> 
> Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using.  What's the state of thread-safe Perl these days?
> 
> -Aaron


From hartzell at alerce.com  Thu Feb  7 16:36:24 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 13:36:24 -0800
Subject: [Bioperl-l]  BioPerl long-term,
	was Re:  dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <20756.7768.125680.662488@gargle.gargle.HOWL>

Fields, Christopher J writes:
 > George,
 > 
 > Should put your post on a pedestal :)
 > 
 > tl;dr version: I completely agree, but we need help in order to do this.
 > [...]

And therein lies the [a] problem.  Don't look at me....

I'm not coding on bioinformatics problems these days (though I'm
available...) so _maybe_ I shouldn't have gotten up on the soapbox.

But I'm so sick of getting into arguments (or walking away from
them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
you can't write good code in Perl, look - Ruby has GEMS!, etc...

Perl of the olden days was an easy language in which to write really
shitty code.  Even the Perl of the BioPerl heyday wasn't really much
help; roll your own OO, roll your own distro-building, mountains of
monkey-work to provide consistent POD, versioning, etc...

But that's not the Perl that I use.  I have Moose and Moo.  TAP and
the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

It isn't any harder to write good code, for measures that I care
about, using Perl than it is *any* of the other similar languages.

And it's just as easy, and happens just as frequently, for people to
write shitty (undocumented, untested, poorly managed, poorly packaged,
...) stuff in the other languages.

GET OFF MY LAWN, KID! (Yeah, I know...)

But BioPerl *is* dying.  You might be standing on the shoulders of
giants when you use it to solve a problem, but you *definitely* have
those same giants (and their extended families) on your shoulders
every time I see you try to move the project forward.  All of that
history has become the tail that's wagging the dog.

If all y'all are going to keep the thing alive, moving forward and
contributing to new great works then make Apple your hero.  Deprecate
the stuff that's holding you back, give folks a path forward and move
on.

Have fun.  Use sharp tools.  Do cool science.  Build cool things.
Advance your careers (forgot that one last time).  Be reasonable and
professional.

Supporting last year's projects is someone else's business
opportunity.

g.

ps.  Are all y'all following this thread?

     http://news.ycombinator.com/item?id=5123022

Maybe someone should search down for this bit: "Where to start? Any
list of this [sic] projects?" and insert a plug for the various
open-bio projects.  (But "someone" doesn't work here, he said...).


From cjfields at illinois.edu  Thu Feb  7 18:12:19 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 23:12:19 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re:  dependencies on perl version
In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1D071@CHIMBX5.ad.uillinois.edu>

On Feb 7, 2013, at 3:36 PM, George Hartzell <hartzell at alerce.com> wrote:

> Fields, Christopher J writes:
>> George,
>> 
>> Should put your post on a pedestal :)
>> 
>> tl;dr version: I completely agree, but we need help in order to do this.
>> [...]
> 
> And therein lies the [a] problem.  Don't look at me....
> 
> I'm not coding on bioinformatics problems these days (though I'm
> available...) so _maybe_ I shouldn't have gotten up on the soapbox.
> 
> But I'm so sick of getting into arguments (or walking away from
> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
> you can't write good code in Perl, look - Ruby has GEMS!, etc...

Right, but that's a perception not just in the Bio* world.  It's larger and more pervasive than that.  

> Perl of the olden days was an easy language in which to write really
> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
> help; roll your own OO, roll your own distro-building, mountains of
> monkey-work to provide consistent POD, versioning, etc...
> 
> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

Yes, and that is the direction we need to go in.

> It isn't any harder to write good code, for measures that I care
> about, using Perl than it is *any* of the other similar languages.
> 
> And it's just as easy, and happens just as frequently, for people to
> write shitty (undocumented, untested, poorly managed, poorly packaged,
> ...) stuff in the other languages.

Oh, I know.  I'm working on some very nice looking but terribly implemented Python code now.

> GET OFF MY LAWN, KID! (Yeah, I know...)
> 
> But BioPerl *is* dying.  You might be standing on the shoulders of
> giants when you use it to solve a problem, but you *definitely* have
> those same giants (and their extended families) on your shoulders
> every time I see you try to move the project forward.  All of that
> history has become the tail that's wagging the dog.

Yep.

> If all y'all are going to keep the thing alive, moving forward and
> contributing to new great works then make Apple your hero.  Deprecate
> the stuff that's holding you back, give folks a path forward and move
> on.

That's fine.

> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
> Advance your careers (forgot that one last time).  Be reasonable and
> professional.
> 
> Supporting last year's projects is someone else's business
> opportunity.
> 
> g.

Right, but this isn't just my show.  I can't do this alone; it's simply too much code and I don't have even 1/4 the time I used to have.

> ps.  Are all y'all following this thread?
> 
>     http://news.ycombinator.com/item?id=5123022
> 
> Maybe someone should search down for this bit: "Where to start? Any
> list of this [sic] projects?" and insert a plug for the various
> open-bio projects.  (But "someone" doesn't work here, he said...).

Read the original guy's post.  He's completely delusional (okay, maybe not *completely*, but he comes across as quite bitter and unrealistic).  

Frankly I don't feel so bad if he wants to leave.  He doesn't like messy things.  Biology is messy, if one doesn't understand that then computational biology is not for them.

chris


From carandraug+dev at gmail.com  Thu Feb  7 23:12:22 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Fri, 8 Feb 2013 04:12:22 +0000
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
Message-ID: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>

On 6 February 2013 22:11, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
> [...]
> So:
>
> If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.
>
> If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.
>
> If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.
>
> If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).
>
> If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later.
>
> But I can't do it alone.  We (not just me, but we) need to drive the direction we take.
>
> First one who codes gets the gold ring.

Hi

I know I'm not much involved with bioperl development but here's my
suggestion as maintainer of another quite modular free software
project. I swear I'm not promoting it. Skip to the last paragraph for
the very short version.

Octave Forge is now a collection of packages for GNU Octave, each
released independently whenever its maintainer sees fit. But it wasn't
like that before. For a long time, everything was released at the same
time; there were no independent packages. Then it was decided to split
it into sections: main, extra and nonfree (free software dependent on
non-free libraries, now purged), and inside those, it was split into
packages, each with its own maintainer. But some packages were (and
are) more active than the others. Some packages even came from single
contributions and we never heard from the authors again. And so, with
time, cruft settled in.

We didn't want to remove the code, but no one was interested or
comfortable enough in the field to fix it either. Packages that had a
much more active development were being dragged down by code that no
one was maintaining. So we broke with that and each package is now
released independently. We have packages that haven't been released in
3 years, yes, but that just shows which packages no one cares about.
Those have been marked as unmaintained and anyone can come around and
make a release if they care about it.

As the maintainer of the project, I do *not* make the releases of the
packages. The package maintainers prepare everything and upload
it; I only run a handful of tests (takes me 10min), upload it to our
server, and make the official announcement. I am also the maintainer
of one of the packages, and have often made releases of unmaintained
packages because I needed them. That's to show, if they are important
enough for someone, they will get a release somehow. If they are not
important, why would we waste our time on them anyway? We now have
around 5 package releases per month, many of them being minor releases
with a handful of bug fixes. Preparing a release of a small package is
much easier and much less trouble than preparing a giant release
encompassing all of them at the same time.

Short version:
I'd recommend splitting the project into much smaller ones. Some of the
small ones will wither and die but those are the less important ones,
and that will allow the others, the ones that people care about, freedom
to grow faster. Bioperl would still be just one project that
incorporates a hundred or so smaller modules. Let those who care
the most about a specific module take care of it and make the
releases. Releasing a module becomes much simpler, which means more
releases, more activity, and the smaller code base for each module
also makes it less intimidating for new contributors.

Carnë


From hartzell at alerce.com  Fri Feb  8 01:17:17 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 22:17:17 -0800
Subject: [Bioperl-l] injecting a bit of levity....
Message-ID: <20756.39021.553502.116384@gargle.gargle.HOWL>


Perl's not dead.  It's FAMOUS!

  http://imgs.xkcd.com/comics/perl_problems.png

g.


From carandraug+dev at gmail.com  Fri Feb  8 01:57:30 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Fri, 8 Feb 2013 06:57:30 +0000
Subject: [Bioperl-l] getting a Bio::Search::HSP::HSPI from Bio::SimpleAlign
 (to find differences between sequences)
Message-ID: <CAPOrs_084-eh9kq=uWk19jvLagKKGr2qOs3HpGLpBt7YOLaO4A@mail.gmail.com>

Hi

I already have a Bio::SimpleAlign object (got it after using TCoffee
through bioperl-run module) and I'm trying to get a
Bio::Search::HSP::HSPI object from a pair of the aligned sequences.
How can I do this? I want to use the seq_inds method to compare the
sequences.

Here's my actual problem just in case I should be trying to fix it
some other way. I have a bunch of sequences from protein isoforms.
They have small differences between them, point-mutations, small
insertions or deletions, nothing too big. I want to make a table of
the mutations that each of them has against the consensus sequence. I
already made the alignment and got the consensus with
"$align->consensus_string". Now, I want to get something like:

isoform1: Ala67Gly, His90_Met91insGln
isoform2: ....

The seq_inds method from the Bio::Search::HSP::HSPI class seems to do
the part of finding the differences, but how can I get one? I can't
find it in the documentation.

Any tips, and even showing a different approach to my problem, are
most appreciated. Thanks,

Carnë
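
One way to build such a table without an HSP object is to walk the alignment column by column. The sketch below is only an illustration under stated assumptions: it works on plain strings (which could come from $align->consensus_string and from $seq->seq for each sequence returned by $align->each_seq), it assumes '-' is the gap character in both rows, and the function names list_differences and aa3 are made up for this example. How consensus_string treats columns that are mostly gaps should be checked before relying on the insertion branch.

```perl
use strict;
use warnings;

# One-letter to three-letter amino acid codes.
my %THREE = (
    A => 'Ala', R => 'Arg', N => 'Asn', D => 'Asp', C => 'Cys',
    Q => 'Gln', E => 'Glu', G => 'Gly', H => 'His', I => 'Ile',
    L => 'Leu', K => 'Lys', M => 'Met', F => 'Phe', P => 'Pro',
    S => 'Ser', T => 'Thr', W => 'Trp', Y => 'Tyr', V => 'Val',
);

sub aa3 { my ($c) = @_; return $THREE{ uc $c } // $c }

# Compare one aligned row against the reference row of the same alignment.
# Positions are counted on the ungapped reference.
sub list_differences {
    my ($ref, $row) = @_;
    die "aligned strings differ in length\n"
        unless length($ref) == length($row);

    my (@diffs, @ins);
    my ($pos, $prev) = (0, undef);
    for my $i (0 .. length($ref) - 1) {
        my $r = substr $ref, $i, 1;
        my $s = substr $row, $i, 1;
        if ($r eq '-') {                    # column absent from the reference
            push @ins, aa3($s) if $s ne '-';
            next;
        }
        $pos++;
        if (@ins) {                         # close a pending insertion
            my $left = defined $prev ? aa3($prev) . ($pos - 1) : 'start';
            push @diffs, $left . '_' . aa3($r) . $pos . 'ins' . join('', @ins);
            @ins = ();
        }
        if    ($s eq '-')          { push @diffs, aa3($r) . $pos . 'del' }
        elsif (uc($s) ne uc($r))   { push @diffs, aa3($r) . $pos . aa3($s) }
        $prev = $r;
    }
    # Insertion after the last reference residue (ad hoc label).
    push @diffs, (defined $prev ? aa3($prev) . $pos : 'start')
        . '_endins' . join('', @ins) if @ins;
    return @diffs;
}

print "isoform1: ", join(', ', list_differences('MKAH-MW', 'MKGHQM-')), "\n";
```

Each deleted residue is reported separately here; merging runs of deletions into ranges would be a small extension.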


From l.m.timmermans at students.uu.nl  Fri Feb  8 06:18:58 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Fri, 8 Feb 2013 12:18:58 +0100
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
Message-ID: <CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>

On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell <hartzell at alerce.com> wrote:
> But I'm so sick of getting into arguments (or walking away from
> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
> you can't write good code in Perl, look - Ruby has GEMS!, etc...
>
> Perl of the olden days was an easy language in which to write really
> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
> help; roll your own OO, roll your own distro-building, mountains of
> monkey-work to provide consistent POD, versioning, etc...
>
> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

I share that experience.

> But BioPerl *is* dying.  You might be standing on the shoulders of
> giants when you use it to solve a problem, but you *definitely* have
> those same giants (and their extended families) on your shoulders
> every time I see you try to move the project forward.  All of that
> history has become the tail that's wagging the dog.

I share your sentiment. Most of BioPerl is architected so badly I
can't stomach it most days, and I've worked on hairy codebases
including perl itself. There's just too much sick and wrong. It's like
hundreds of dot-com-era cgi scripts.

The problem (which is common in scientific computing) is that once
code works it's effectively abandoned. BioPerl is essentially a
gathering of more than a thousand such modules.

> If all y'all are going to keep the thing alive, moving forward and
> contributing to new great works then make Apple your hero.  Deprecate
> the stuff that's holding you back, give folks a path forward and move
> on.

That would be lovely, but who is going to do that? We're suffering
from the tragedy of the commons.

> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
> Advance your careers (forgot that one last time).  Be reasonable and
> professional.

Sounds like good advice to me :-)

> Supporting last year's projects is someone else's business
> opportunity.

True!

> ps.  Are all y'all following this thread?
>
>      http://news.ycombinator.com/item?id=5123022
>
> Maybe someone should search down for this bit: "Where to start? Any
> list of this [sic] projects?" and insert a plug for the various
> open-bio projects.  (But "someone" doesn't work here, he said...).

Interesting discussion, though the original post is too cynical even
for my taste.

Leon


From cjfields at illinois.edu  Fri Feb  8 09:08:56 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 8 Feb 2013 14:08:56 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
	<CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1DA2D@CHIMBX5.ad.uillinois.edu>

On Feb 8, 2013, at 5:18 AM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell <hartzell at alerce.com> wrote:
>> But I'm so sick of getting into arguments (or walking away from
>> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
>> you can't write good code in Perl, look - Ruby has GEMS!, etc...
>> 
>> Perl of the olden days was an easy language in which to write really
>> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
>> help; roll your own OO, roll your own distro-building, mountains of
>> monkey-work to provide consistent POD, versioning, etc...
>> 
>> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
>> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
>> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.
> 
> I share that experience.
> 
>> But BioPerl *is* dying.  You might be standing on the shoulders of
>> giants when you use it to solve a problem, but you *definitely* have
>> those same giants (and their extended families) on your shoulders
>> every time I see you try to move the project forward.  All of that
>> history has become the tail that's wagging the dog.
> 
> I share your sentiment. Most of BioPerl is architected so badly I
> can't stomach it most days, and I've worked on hairy codebases
> including perl itself. There's just too much sick and wrong. It's like
> hundreds of dot-com-era cgi scripts.
> 
> The problem (which is common in scientific computing) is that once
> code works it's effectively abandoned. BioPerl is essentially a
> gathering of more than a thousand such modules.

Yep, the progression from 'it works' to 'it works very well' tends to have very high activation energy.  Many of the fixes tend to be more bandaids (get it working) than fundamental surgery.  I tried my hand at this, got a few things done.

>> If all y'all are going to keep the thing alive, moving forward and
>> contributing to new great works then make Apple your hero.  Deprecate
>> the stuff that's holding you back, give folks a path forward and move
>> on.
> 
> That would be lovely, but who is going to do that? We're suffering
> from the tragedy of the commons.

Spot on, but we could break that path for the time being.  I think BioPerl as is will have to be in maintenance mode; we need a new effort to break with older perl, older practices.  

>> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
>> Advance your careers (forgot that one last time).  Be reasonable and
>> professional.
> 
> Sounds like good advice to me :-)
> 
>> Supporting last year's projects is someone else's business
>> opportunity.
> 
> True!

We just need to make a bioperl 1.x branch for the maintenance bit, rechristen 'master' as 'v2', and just move on to fixing the f****** code.  Let's move on that.

>> ps.  Are all y'all following this thread?
>> 
>>     http://news.ycombinator.com/item?id=5123022
>> 
>> Maybe someone should search down for this bit: "Where to start? Any
>> list of this [sic] projects?" and insert a plug for the various
>> open-bio projects.  (But "someone" doesn't work here, he said...).
> 
> Interesting discussion, though the original post is too cynical even
> for my taste.
> 
> Leon

Yes, that's not unusual unfortunately.  We have a number of physicists and mathematicians here who have started their initial forays into computational biology; they're all startled at how noisy it is and how messy code can be.  Of course their disciplines have had the benefit of teaching students how to (somewhat decently) code for the last 40 years.

chris


From l.m.timmermans at students.uu.nl  Fri Feb  8 07:08:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Fri, 8 Feb 2013 13:08:06 +0100
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
References: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
Message-ID: <CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>

On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
> Short version:
> I'd recommend splitting the project into much smaller ones. Some of the
> small ones will wither and die but those are the less important ones,
> and that will allow the others, the ones that people care about, freedom
> to grow faster. Bioperl would still be just one project that
> incorporates a hundred or so smaller modules. Let those who care
> the most about a specific module take care of it and make the
> releases. Releasing a module becomes much simpler, which means more
> releases, more activity, and the smaller code base for each module
> also makes it less intimidating for new contributors.

That has been a goal for some time now, but it's fairly complicated.
Not only do we have a LOT of modules (bioperl-live alone is more than
900), they also have complicated dependencies. I've attached the
results of my static dependency analysis of bioperl-live. I suspect
this split-up needs to be done by automated graph analysis; it's too much
to do by hand.
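
As an illustration of what such a static analysis could look like (this is a sketch, not the script that produced the attached files), the code below pulls Bio::* names out of use/require lines, builds a dependency graph, and reports everything a module would drag along if it were split out. The function names are made up for the example, the module sources are inline strings rather than files read from a bioperl-live checkout, and modules loaded at run time from a string are not seen at all.

```perl
use strict;
use warnings;

# Bio::* modules that a piece of Perl source loads statically.
sub static_deps {
    my ($source) = @_;
    my %deps;
    for my $line (split /\n/, $source) {
        next if $line =~ /^\s*#/;                        # skip comment lines
        if ($line =~ /^\s*use\s+(?:base|parent)\b(.*)/) {
            # e.g. use base qw(Bio::Root::Root Bio::SeqI);
            $deps{$_} = 1 for $1 =~ /\b(Bio(?:::\w+)+)/g;
        }
        elsif ($line =~ /^\s*(?:use|require)\s+(Bio(?:::\w+)+)/) {
            $deps{$1} = 1;
        }
    }
    return sort keys %deps;
}

# Map module name => list of direct dependencies.
sub build_graph {
    my (%sources) = @_;
    my %graph;
    for my $mod (keys %sources) {
        $graph{$mod} = [ grep { $_ ne $mod } static_deps($sources{$mod}) ];
    }
    return \%graph;
}

# Everything reachable from $start: what would have to ship with it.
sub closure {
    my ($graph, $start) = @_;
    my %seen;
    my @todo = ($start);
    while (defined(my $mod = shift @todo)) {
        next if $seen{$mod}++;
        push @todo, @{ $graph->{$mod} // [] };
    }
    delete $seen{$start};
    return sort keys %seen;
}

# Graphviz output, in the spirit of a deps.dot file.
sub to_dot {
    my ($graph) = @_;
    my @lines = ('digraph deps {');
    for my $mod (sort keys %$graph) {
        push @lines, qq{    "$mod" -> "$_";} for @{ $graph->{$mod} };
    }
    return join("\n", @lines, '}') . "\n";
}

my $graph = build_graph(
    'Bio::Seq'        => "package Bio::Seq;\nuse base qw(Bio::Root::Root Bio::PrimarySeqI);\n1;\n",
    'Bio::Root::Root' => "package Bio::Root::Root;\nuse Bio::Root::Exception;\n1;\n",
);
print to_dot($graph);
print join(', ', closure($graph, 'Bio::Seq')), "\n";
```

Modules whose closures are small and disjoint are the easy candidates for separate distributions; large shared closures point at the core that has to stay together.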

Leon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.dot
Type: application/octet-stream
Size: 93463 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20130208/bdbbda1e/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.png
Type: image/png
Size: 6694525 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20130208/bdbbda1e/attachment-0002.png>

From sebastien.moretti at unil.ch  Fri Feb  8 11:19:29 2013
From: sebastien.moretti at unil.ch (Moretti Sébastien)
Date: Fri, 08 Feb 2013 17:19:29 +0100
Subject: [Bioperl-l] PhyloXML
Message-ID: <51152591.9010402@unil.ch>

Hi

I would like to add some XML to an existing PhyloXML tree.

Reading and writing it is no problem.
I would like to add <name>smthg</name> after the <phylogeny> tag as in 
http://www.phyloxml.org/examples_syntax/phyloxml_syntax_example_1.html
but I run into problems with add_phyloXML_annotation():

Can't locate object method "annotation" via package "Bio::Tree::Tree" at
         /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, <GEN0> line 1 (#1)
     (F) You called a method correctly, and it correctly indicated a package
     functioning as a class, but that package doesn't define that particular
     method, nor does any of its base classes.  See perlobj.

Uncaught exception from user code:
         Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, <GEN0> line 1.
  at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984
         Bio::TreeIO::phyloxml::element_default('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 670
         Bio::TreeIO::phyloxml::processXMLNode('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 309
         Bio::TreeIO::phyloxml::add_phyloXML_annotation('Bio::TreeIO::phyloxml=HASH(0x134b1268)', '-obj', 'Bio::Tree::Tree=HASH(0x13525258)', '-xml', '<name>SUMF family</name>') called at ./add_annotation_to_phyloxml.pl line 40


I think I'm doing something wrong, but what?
Here is the code

my $treeio = new Bio::TreeIO(-file   => "$infile",
                              -format => 'phyloxml',
                             );
my $tree = $treeio->next_tree;

# Add annotation
$treeio->add_phyloXML_annotation(-obj => $tree,
                                  -xml => '<name>SUMF family</name>',
                                 );

-- 
Sébastien Moretti


From cjfields at illinois.edu  Sat Feb  9 01:25:17 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 06:25:17 +0000
Subject: [Bioperl-l] BioPerl future
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>

All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.  

Thoughts?  

chris


From cjfields at illinois.edu  Sat Feb  9 01:43:24 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 06:43:24 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>
References: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
	<CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F2C6@CHIMBX5.ad.uillinois.edu>

On Feb 8, 2013, at 6:08 AM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>> Short version:
>> I'd recommend splitting the project into much smaller ones. Some of the
>> small ones will wither and die but those are the less important ones,
>> and that will allow the others, the ones that people care about, freedom
>> to grow faster. Bioperl would still be just one project that
>> incorporates a hundred or so smaller modules. Let those who care
>> the most about a specific module take care of it and make the
>> releases. Releasing a module becomes much simpler, which means more
>> releases, more activity, and the smaller code base for each module
>> also makes it less intimidating for new contributors.
> 
> That has been a goal for some time now, but it's fairly complicated.
> Not only do we have a LOT of modules (bioperl-live alone is more than
> 900), they also have complicated dependencies. I've attached the
> results of my static dependency analysis of bioperl-live. I suspect
> this split-up needs to be done by automated graph analysis; it's too much
> to do by hand.
> 
> Leon
> <deps.dot><deps.png>

Leon, 

I'm hoping we can do this sooner rather than later.  In fact, if we proceed with making a 'v1' branch or something similar, we can start extricating code soon (next few weeks).

chris


From cjfields at illinois.edu  Sat Feb  9 08:51:35 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 13:51:35 +0000
Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future
Message-ID: <prc698q0fqtymq1n70jhdi5w.1360417710993@email.android.com>

Sheldon,

The branch is where the old (v1.x) code would reside.  Master branch would be v2.

Chris


Sent via phone


-------- Original message --------
From: Sheldon McKay <sheldon.mckay at gmail.com>
Date:
To: "Fields, Christopher J" <cjfields at illinois.edu>
Cc: BioPerl List <Bioperl-l at lists.open-bio.org>,gmod-gbrowse at lists.sourceforge.net
Subject: Re: [Gmod-gbrowse] BioPerl future


Hi Chris,

This sounds like a good idea.  I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands.

One question I have is whether the name of branch v1 might be perceived as a step backward.  How about v2?

Sheldon

On Saturday, February 9, 2013, Fields, Christopher J wrote:
All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.

Thoughts?

chris


--
Sheldon McKay, PhD
Computational Biologist
DNA Learning Center
Cold Spring Harbor Laboratory
1 Bungtown Rd
Cold Spring Harbor, NY 11724
(516) 367-5185
www.dnalc.org


From sheldon.mckay at gmail.com  Sat Feb  9 08:04:50 2013
From: sheldon.mckay at gmail.com (Sheldon McKay)
Date: Sat, 9 Feb 2013 08:04:50 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAEs59kkOhJ-czn_aXOcP+yOszQdGGLgaAMNp+u_0MqS=xXapng@mail.gmail.com>

Hi Chris,

This sounds like a good idea.  I think it will eventually allow bioperl to
evolve into a leaner, meaner package that would be more likely to be
adopted by new or isolated bioinformaticians, who tend to be put off by
the size and complexity of bioperl as it now stands.

One question I have is whether the name of branch v1 might be perceived as
a step backward.  How about v2?

Sheldon

On Saturday, February 9, 2013, Fields, Christopher J wrote:

> All,
>
> (cross-posting to gmod-gbrowse)
>
> I want to gauge the community's thoughts on a few things.  At the moment I
> think we can safely say that BioPerl 1.x is in maintenance mode.  By
> 'maintenance mode', I mean that we can only do so much with it w/o breaking
> backwards compatibility with old scripts.  We need a way forward so that we
> can address fundamental problems within the core codebase, namely speed.
>
> I am thinking at the moment of pushing a 'v1' branch next week after I
> make an official announcement, with a new 1.6 release coming out from that
> branch (as already announced, tentatively scheduled for March 1).  That
> frees up master for any code development, removal of modules/cruft, etc.
>  This will open an initial path forward and at least enable us to do more.
>  Make sense?  This of course means that any code reliant on v1 should pull
> from that branch instead of 'master'.
>
> Thoughts?
>
> chris
>
> _______________________________________________
> Gmod-gbrowse mailing list
> Gmod-gbrowse at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
>


-- 
Sheldon McKay, PhD
Computational Biologist
DNA Learning Center
Cold Spring Harbor Laboratory
1 Bungtown Rd
Cold Spring Harbor, NY 11724
(516) 367-5185
www.dnalc.org


From cjfields at illinois.edu  Sat Feb  9 23:25:14 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 10 Feb 2013 04:25:14 +0000
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>

Apologies if you receive this twice. I never received the replies from the gbrowse list through bioperl-l so it is possible there were mail issues last night.

------------------------

All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.  

Thoughts?  

chris


From genehack at genehack.org  Sat Feb  9 23:36:07 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Sat, 9 Feb 2013 20:36:07 -0800
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>
References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>
Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6@genehack.org>

On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:

> Thoughts?  

+1

The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. 

j.

-- 
John SJ Anderson // genehack at genehack.org


From carandraug+dev at gmail.com  Sun Feb 10 13:40:33 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Sun, 10 Feb 2013 18:40:33 +0000
Subject: [Bioperl-l] BioPerl future
Message-ID: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>

On 10 February 2013 17:00,  <bioperl-l-request at lists.open-bio.org> wrote:
> Message: 3
> Date: Sat, 9 Feb 2013 20:36:07 -0800
> From: John SJ Anderson <genehack at genehack.org>
> Subject: Re: [Bioperl-l] BioPerl future
> To: "Fields, Christopher J" <cjfields at illinois.edu>
> Cc: BioPerl List <Bioperl-l at lists.open-bio.org>
> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org>
> Content-Type: text/plain; charset=us-ascii
>
> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
>
>> Thoughts?
>
> +1
>
> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories.

For those interested, I have just added instructions on the wiki on
how to split a subset of modules, tests, files, etc from the
bioperl-live repository into a new repository while keeping their old
history.

http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live

Carnë


From cjfields at illinois.edu  Sun Feb 10 15:08:35 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 10 Feb 2013 20:08:35 +0000
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>
References: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE20632@CHIMBX5.ad.uillinois.edu>

On Feb 10, 2013, at 12:40 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 10 February 2013 17:00,  <bioperl-l-request at lists.open-bio.org> wrote:
>> Message: 3
>> Date: Sat, 9 Feb 2013 20:36:07 -0800
>> From: John SJ Anderson <genehack at genehack.org>
>> Subject: Re: [Bioperl-l] BioPerl future
>> To: "Fields, Christopher J" <cjfields at illinois.edu>
>> Cc: BioPerl List <Bioperl-l at lists.open-bio.org>
>> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
>> 
>>> Thoughts?
>> 
>> +1
>> 
>> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories.
> 
> For those interested, I have just added instructions on the wiki on
> how to split a subset of modules, tests, files, etc from the
> bioperl-live repository into a new repository while keeping their old
> history.
> 
> http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live
> 
> Carnë

It's probably worth looking at this page as well, then:

http://www.bioperl.org/wiki/BioPerl_Modularization

We should probably merge the two.

chris


From hlapp at drycafe.net  Sun Feb 10 20:03:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Sun, 10 Feb 2013 20:03:34 -0500
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <51152591.9010402@unil.ch>
References: <51152591.9010402@unil.ch>
Message-ID: <F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>

On Feb 8, 2013, at 11:19 AM, Moretti Sébastien <sebastien.moretti at unil.ch> wrote:

> # Add annotation
> $treeio->add_phyloXML_annotation(-obj => $tree,
>                                -xml => '<name>SUMF family</name>',
>                               );

If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?

	-hilmar

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From sebastien.moretti at unil.ch  Mon Feb 11 02:08:22 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=)
Date: Mon, 11 Feb 2013 08:08:22 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
Message-ID: <511898E6.7060400@unil.ch>

>> # Add annotation
>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>                                 -xml => '<name>SUMF family</name>',
>>                                );
>
> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>
> 	-hilmar

I replaced $treeio with $tree in the above line but still get an error.
I don't see what you mean by "the stack suggests that the above isn't the
exact line in your script".

The only thing I changed is the length of the XML string I try to
insert, but I get the same error with an empty XML string.


my $treeio = new Bio::TreeIO(-file   => "$infile",
                              -format => 'phyloxml',
                             );
my $tree = $treeio->next_tree;

# Add annotation
$tree->add_phyloXML_annotation(-obj => $tree,
                                -xml => '<name>SUMF family</name>',
                               );

Can't locate object method "add_phyloXML_annotation" via package
	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> 
line 1 (#1)
     (F) You called a method correctly, and it correctly indicated a package
     functioning as a class, but that package doesn't define that particular
     method, nor does any of its base classes.  See perlobj.

Uncaught exception from user code:
	Can't locate object method "add_phyloXML_annotation" via package 
"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
  at ./add_annotation_to_phyloxml.pl line 40
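
The message above is Perl's generic report that neither the invocant's class (here Bio::Tree::Tree) nor any of its base classes defines the method; it says nothing about the arguments passed. A minimal sketch of checking for a method with can() before calling it follows. The Toy:: packages and the annotate method are hypothetical stand-ins, not BioPerl classes, and which BioPerl class actually provides add_phyloXML_annotation is not confirmed in this thread.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for illustration only; these are NOT BioPerl classes.
{
    package Toy::Tree;
    sub new { return bless {}, shift }
}
{
    package Toy::TreeIO;
    sub new { return bless {}, shift }
    sub annotate {
        my ($self, %arg) = @_;
        return 'annotated with ' . $arg{'-xml'};
    }
}

package main;

my $tree   = Toy::Tree->new;
my $treeio = Toy::TreeIO->new;

# can() returns a code reference if the object's class (or one of its base
# classes) defines the method, and a false value otherwise.
for my $obj ($tree, $treeio) {
    my $class = ref $obj;
    if (my $method = $obj->can('annotate')) {
        print "$class: ", $obj->$method(-xml => '<name>SUMF family</name>'), "\n";
    }
    else {
        print "$class: no annotate method\n";
    }
}
```

Running the same can() check against the real objects in a script shows which of them is able to answer the call.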


-- 
Sébastien Moretti
Department of Ecology and Evolution,
Biophore, University of Lausanne,
CH-1015 Lausanne, Switzerland
Tel.: +41 (21) 692 4221/4079
http://bioinfo.unil.ch/


From saladi1 at illinois.edu  Tue Feb 12 16:24:34 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Tue, 12 Feb 2013 13:24:34 -0800
Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons
Message-ID: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>

Hi,

I am using the count_codons method from Bio::Tools::SeqStats and keep
getting "AMBIGUOUS" codons, but I can't figure out why exactly.

When I translate the same sequence that gives the error using another
standard utility like (ExPASy - Translate), it seems to work alright.

An example sequence is below. Could anyone lend some insight?

Thanks,
Shyam


codon         percent of codons
AAA           1.722488038277511961722488038277511961722
AAC           2.966507177033492822966507177033492822967
AAG           1.531100478468899521531100478468899521531
AAT           0.9569377990430622009569377990430622009569
ACA           0.4784688995215311004784688995215311004785
ACC           1.722488038277511961722488038277511961722
ACG           1.33971291866028708133971291866028708134
ACT           1.913875598086124401913875598086124401914
AGA           0.1913875598086124401913875598086124401914
AGC           0.7655502392344497607655502392344497607656
AGT           1.435406698564593301435406698564593301435
*AMBIGUOUS*   0.09569377990430622009569377990430622009569
ATA           0.3827751196172248803827751196172248803828
ATC           2.488038277511961722488038277511961722488
ATG           3.349282296650717703349282296650717703349
ATT           3.636363636363636363636363636363636363636
CAA           2.870813397129186602870813397129186602871
CAC           0.3827751196172248803827751196172248803828
CAG           1.626794258373205741626794258373205741627
CAT           0.4784688995215311004784688995215311004785
CCA           1.722488038277511961722488038277511961722
CCC           0.5741626794258373205741626794258373205742
CCG           1.052631578947368421052631578947368421053
CCT           1.244019138755980861244019138755980861244
CGA           0.3827751196172248803827751196172248803828
CGC           0.7655502392344497607655502392344497607656
CGG           0.1913875598086124401913875598086124401914
CGT           2.488038277511961722488038277511961722488
CTA           0.4784688995215311004784688995215311004785
CTC           0.6698564593301435406698564593301435406699
CTG           2.105263157894736842105263157894736842105
CTT           0.8612440191387559808612440191387559808612
GAA           2.870813397129186602870813397129186602871
GAC           1.435406698564593301435406698564593301435
GAG           1.722488038277511961722488038277511961722
GAT           2.775119617224880382775119617224880382775
GCA           2.00956937799043062200956937799043062201
GCC           2.488038277511961722488038277511961722488
GCG           3.540669856459330143540669856459330143541
GCT           2.00956937799043062200956937799043062201
GGA           0.1913875598086124401913875598086124401914
GGC           2.392344497607655502392344497607655502392
GGG           0.8612440191387559808612440191387559808612
GGT           5.454545454545454545454545454545454545455
GTA           1.913875598086124401913875598086124401914
GTC           0.8612440191387559808612440191387559808612
GTG           4.593301435406698564593301435406698564593
GTT           2.679425837320574162679425837320574162679
TAA           0.09569377990430622009569377990430622009569
TAC           1.148325358851674641148325358851674641148
TAT           1.148325358851674641148325358851674641148
TCA           0.8612440191387559808612440191387559808612
TCC           0.4784688995215311004784688995215311004785
TCG           2.105263157894736842105263157894736842105
TCT           0.9569377990430622009569377990430622009569
TGG           0.9569377990430622009569377990430622009569
TGT           0.09569377990430622009569377990430622009569
TTA           2.679425837320574162679425837320574162679
TTC           2.966507177033492822966507177033492822967
TTG           3.062200956937799043062200956937799043062
TTT           2.775119617224880382775119617224880382775

count         1045
filename      temp.seq

ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTACGCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTCGTATTTGCCTTCGTACCACCTGC
GGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAGATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTAGGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA


From bosborne11 at verizon.net  Tue Feb 12 21:30:08 2013
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 12 Feb 2013 21:30:08 -0500
Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons
In-Reply-To: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>
References: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>
Message-ID: <C13C35A7-4DBE-4797-A584-DCB6AF772D25@verizon.net>

Shyam,

An ambiguous codon would be one that has a character other than [ACTGU] in it. I see '!' in your sequences; that would create an ambiguous codon.
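
A minimal sketch of that check in plain Perl, assuming codons are read in frame from the first base. The helper find_ambiguous_codons is hypothetical (it is not part of Bio::Tools::SeqStats). Note that the sequence as originally posted also contains a Y, the IUPAC code for C or T, in the run GGCTTYTTGATGG, which a scan like this would report.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return every in-frame codon that
# contains a character other than A, C, G, T or U, with its 1-based start.
sub find_ambiguous_codons {
    my ($seq) = @_;
    $seq = uc $seq;
    $seq =~ s/\s+//g;    # drop whitespace and line breaks from pasted text
    my @hits;
    for (my $i = 0; $i + 3 <= length $seq; $i += 3) {
        my $codon = substr $seq, $i, 3;
        push @hits, [ $i + 1, $codon ] if $codon =~ /[^ACGTU]/;
    }
    return @hits;
}

# Toy input: the third codon carries the IUPAC ambiguity code Y.
my @hits = find_ambiguous_codons('ATGGCTTTYTTGTAA');
printf "codon %s at base %d\n", $_->[1], $_->[0] for @hits;
# prints: codon TTY at base 7
```

Running the same scan on the real input file should show exactly which characters are being counted as ambiguous.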

Brian O.


On Feb 12, 2013, at 4:24 PM, Shyam Saladi <saladi1 at illinois.edu> wrote:

> Hi,
> 
> I am using the count_codons method from Bio::Tools::SeqStats and keep
> getting "AMBIGUOUS" codons, but I can't figure out why exactly.
> 
> When I translate the same sequence that gives the error using another
> standard utility like (ExPASy - Translate), it seems to work alright.
> 
> An example sequence is below. Could anyone lend some insight?
> 
> Thanks,
> Shyam
> 
> 
> 
> 
> ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTAC!
> GCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTC!
> GTATTTGCCTTCGTACCACCTGCGGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAG
> ATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTA!
> GGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Feb 13 10:18:10 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 15:18:10 +0000
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>

All,

tl;dr: A lot of change is coming.  Be forewarned and be prepared.

This is an 'official' announcement to the BioPerl community on future BioPerl plans.  We have decided to move continued maintenance of the current BioPerl release series over to the new 'v1' branch.  This branch will be the point where any future versions of 1.6.x code will be released, starting with the (already-scheduled) March 1 release.  The 'master' branch will become the main focal point for future development of BioPerl going into an eventual v2 release, with a focus on performance enhancements, addressing newer technologies like NGS and large data, code cleanup, and simplifying the code base.

We welcome any help with code improvements. GMOD folks? Want to help? This is a good opportunity to address BioPerl shortcomings in the code base!

What this means for anyone using BioPerl currently:

1) We anticipate significant issues if you are relying on the 'master' branch for anything.  To inelegantly state it, the core developers are taking back the 'master' branch for future development. Please please please do not rely on the 'master' branch for stable code; if you are reliant on BioPerl 1.6.x, make sure to use 'v1'.  We can revisit whether to make 'v1' the default checkout branch if/when the need arises.

2) Expect not to find some modules.  We will be migrating modules requiring external dependencies and other associated chunks of the code base out into their own repositories over the next year to help future maintenance; the eventual intent is to release all of these independently on CPAN.  We will completely remove all code previously marked as deprecated, and we may immediately deprecate additional modules if needed (this will of course be discussed on list).

3) Expect version numbering to change significantly.  Because we are releasing code in separate repositories, I fully expect downstream versioning problems if we stick with the current system (e.g. all bioperl-live modules having the same version).  It will be too much of a headache to sync versions for all modules as this will entail making a full release of all bioperl code, one of the main reasons we are splitting out code to begin with.  At the moment, no specific versioning scheme has been chosen, though I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions).  This is the standard that Lincoln has adopted for Bio::Graphics and GBrowse.

4) Expect quick deprecation of methods within modules as needed.  These should of course be brought up to the mail list prior to actual implementation, but I would anticipate some things changing as we try to adopt a more consistent method naming scheme.

5) The same steps outlined for bioperl-live will apply for bioperl-run modules.  We will have to decide the best approach to use for those, e.g. whether to separate them out based on task (alignment), application group (NGS, BLAST, RNA), etc. and how these may fit organically with bioperl-live modules where appropriate.

6) Do not expect a new CPAN release of such code until Dec 2013.  Even then it will be in an alpha stage.  We are all busy campers.
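
To make the versioning point in 3) concrete, here is a small sketch using Perl's core version module of how two-part (decimal) and three-part (dotted) version strings are interpreted. The version strings are illustrative only; as noted above, no scheme has been chosen.

```perl
use strict;
use warnings;
use version;

# Illustrative version strings only; not a proposed BioPerl scheme.
my $dotted  = version->parse('1.6.901');   # two dots: parsed as dotted-decimal
my $decimal = version->parse('1.7');       # one dot: parsed as a decimal number

# normal() shows the dotted form that comparisons are based on.
print $dotted->normal,  "\n";   # v1.6.901
print $decimal->normal, "\n";   # v1.700.0

# version objects overload the comparison operators.
print "1.7 is newer than 1.6.901\n" if $decimal > $dotted;

# The decimal 1.7 is not the same version as the dotted v1.7.0.
print "1.7 and v1.7.0 differ\n" if $decimal != version->parse('v1.7.0');
```

Mixing the two styles across separately released modules is where such comparisons tend to surprise people, which is one argument for settling on a single style.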

We do not anticipate significant changes to bioperl-network or bioperl-db at this time beyond updating them to deal with new changes. 

I'm sure there are many other points that need to be discussed.   Please reply over the next week if you have any concerns. 

chris


From cjfields at illinois.edu  Wed Feb 13 11:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 16:01:07 +0000
Subject: [Bioperl-l] Test-pls ignore
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2506D@CHIMBX5.ad.uillinois.edu>

testing the mail list to see if it is working.

-c


From sebastien.moretti at unil.ch  Wed Feb 13 11:21:23 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=)
Date: Wed, 13 Feb 2013 17:21:23 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
Message-ID: <511BBD83.2000708@unil.ch>

>>>> # Add annotation
>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>                                 -xml => '<name>SUMF family</name>',
>>>>                                );
>>>
>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>
>>> 	-hilmar
>>
>> I replaced $treeio with $tree in the above line but still get an error.
>> I don't see what you mean by "the stack suggests that the above isn't the exact line in your script".
>>
>> The only thing I changed is the length of the XML string I try to insert, but I get the same error with an empty XML string.
>>
>>
>>
>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>                              -format => 'phyloxml',
>>                             );
>> my $tree = $treeio->next_tree;
>>
>> # Add annotation
>> $tree->add_phyloXML_annotation(-obj => $tree,
>>                                -xml => '<name>SUMF family</name>',
>>                               );
>>
>> Can't locate object method "add_phyloXML_annotation" via package
>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>     (F) You called a method correctly, and it correctly indicated a package
>>     functioning as a class, but that package doesn't define that particular
>>     method, nor does any of its base classes.  See perlobj.
>>
>> Uncaught exception from user code:
>> 	Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
>> at ./add_annotation_to_phyloxml.pl line 40
>
> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>
> chris

Do you mean that BioPerl 1.6.901 does not fully support PhyloXML?
Is the problem I am seeing "expected"?

-- 
Sébastien Moretti


From cjfields at illinois.edu  Wed Feb 13 10:47:17 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 15:47:17 +0000
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <511898E6.7060400@unil.ch>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>

On Feb 11, 2013, at 1:08 AM, Sébastien MORETTI <sebastien.moretti at unil.ch> wrote:

>>> # Add annotation
>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>                                -xml => '<name>SUMF family</name>',
>>>                               );
>> 
>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>> 
>> 	-hilmar
> 
> I replaced $treeio with $tree in the above line but still get an error.
> I don't see what you mean by "the stack suggests that the above isn't the exact line in your script".
> 
> The only thing I changed is the length of the XML string I try to insert, but I get the same error with an empty XML string.
> 
> 
> 
> my $treeio = new Bio::TreeIO(-file   => "$infile",
>                             -format => 'phyloxml',
>                            );
> my $tree = $treeio->next_tree;
> 
> # Add annotation
> $tree->add_phyloXML_annotation(-obj => $tree,
>                               -xml => '<name>SUMF family</name>',
>                              );
> 
> Can't locate object method "add_phyloXML_annotation" via package
> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>    (F) You called a method correctly, and it correctly indicated a package
>    functioning as a class, but that package doesn't define that particular
>    method, nor does any of its base classes.  See perlobj.
> 
> Uncaught exception from user code:
> 	Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
> at ./add_annotation_to_phyloxml.pl line 40
> 
> 
> 
> -- 
> Sébastien Moretti
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221/4079
> http://bioinfo.unil.ch/

Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.

chris


From carandraug+dev at gmail.com  Wed Feb 13 12:23:23 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Wed, 13 Feb 2013 17:23:23 +0000
Subject: [Bioperl-l] Next BioPerl release
Message-ID: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>

On 5 February 2013 21:53, Fields, Christopher J <cjfields at illinois.edu> wrote:
> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!

Hi

is this release of bioperl-live only or also includes bioperl-run?

Carnë


From cjfields at illinois.edu  Wed Feb 13 12:08:21 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 17:08:21 +0000
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <511BBD83.2000708@unil.ch>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
	<511BBD83.2000708@unil.ch>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 10:21 AM, Moretti Sébastien <sebastien.moretti at unil.ch> wrote:

>>>>> # Add annotation
>>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>>                                -xml => '<name>SUMF family</name>',
>>>>>                               );
>>>> 
>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>> 
>>>> 	-hilmar
>>> 
>>> I replaced $treeio by $tree in the above line but still get an error.
>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
>>> 
>>> The only thing I changed is the length of the xml string I try to insert. But I get the same error with an empty xml string.
>>> 
>>> 
>>> 
>>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>>                             -format => 'phyloxml',
>>>                            );
>>> my $tree = $treeio->next_tree;
>>> 
>>> # Add annotation
>>> $tree->add_phyloXML_annotation(-obj => $tree,
>>>                               -xml => '<name>SUMF family</name>',
>>>                              );
>>> 
>>> Can't locate object method "add_phyloXML_annotation" via package
>>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>>    (F) You called a method correctly, and it correctly indicated a package
>>>    functioning as a class, but that package doesn't define that particular
>>>    method, nor does any of its base classes.  See perlobj.
>>> 
>>> Uncaught exception from user code:
>>> 	
>>> at ./add_annotation_to_phyloxml.pl line 40
>> 
>> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>> 
>> chris
> 
> You mean that BioPerl 1.6.901 does not fully support PhyloXML?
> Is the problem I have "expected"?
> 
> -- 
> Sébastien Moretti

I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky.  I tried cleaning this up a few years back but didn't make much progress.

The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it):

    $treeio->add_phyloXML_annotation(-obj => $tree,
                              -xml => '<name>SUMF family</name>',
                             );

My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back.  Can you file a bug on this?

https://redmine.open-bio.org/

chris


From cjfields at illinois.edu  Wed Feb 13 13:05:53 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 18:05:53 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 11:23 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 5 February 2013 21:53, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!
> 
> Hi
> 
> is this release of bioperl-live only or also includes bioperl-run?
> 
> Carnë

We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.

chris


From l.m.timmermans at students.uu.nl  Wed Feb 13 14:44:22 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 13 Feb 2013 20:44:22 +0100
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>

On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.

Also, bioperl-run needs to be cut into smaller distributions even more
than bioperl-live. Few people, if any, have all the tools it tries to
wrap at hand, so it's almost impossible to pass its test suite.

We need dists that can realistically pass.

Leon


From cjfields at illinois.edu  Wed Feb 13 16:04:26 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 21:04:26 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25B07@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 1:44 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.
> 
> Also, bioperl-run needs to be cut into smaller distributions even more
> than bioperl-live. Few people, if any, have all the tools it tries to
> wrap at hand, so it's almost impossible to pass its test suite.
> 
> We need dists that can realistically pass.
> 
> Leon

Yup.  It's a mess.

chris


From florent.angly at gmail.com  Wed Feb 13 17:33:14 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 14 Feb 2013 08:33:14 +1000
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
Message-ID: <511C14AA.9030107@gmail.com>

On 14/02/13 01:18, Fields, Christopher J wrote:
> I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions)
Yes, I support the X.Y versioning as well.
Florent


From l.m.timmermans at students.uu.nl  Wed Feb 13 18:12:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Thu, 14 Feb 2013 00:12:06 +0100
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <511C14AA.9030107@gmail.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
	<511C14AA.9030107@gmail.com>
Message-ID: <CAC1jpXBk9prChjjeHmnykWh4j7FRMN1adY0ibzM8uqH1+Z5uGA@mail.gmail.com>

On Wed, Feb 13, 2013 at 11:33 PM, Florent Angly <florent.angly at gmail.com> wrote:
> On 14/02/13 01:18, Fields, Christopher J wrote:
>>
>> I *highly* recommend using X.Y versioning for simplicity (e.g. no more
>> 3-point versions)
>
> Yes, I support the X.Y versioning as well.
> Florent

See also: http://www.dagolden.com/index.php/369/version-numbers-should-be-boring/

Leon


From daisieh at gmail.com  Thu Feb 14 00:21:15 2013
From: daisieh at gmail.com (Daisie Huang)
Date: Wed, 13 Feb 2013 21:21:15 -0800 (PST)
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
References: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
Message-ID: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>

I think you need to rewind the filehandle to the start of the file before
you go through the while loop the second time: seek($fh, 0, 0);
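
A minimal sketch of the idea, using an in-memory filehandle and made-up rows
so that it runs on its own. The sample data and the count_family helper are
assumptions for illustration, not the real file or the original script:

```perl
use strict;
use warnings;

# Made-up sample data standing in for the real tab-delimited file.
my $sample = "1\tABF09-AS5\n2\tABF09-AS9\n3\tABF09-AS5\n";
open(my $fh, '<', \$sample) or die "Can't open in-memory file: $!";

# Hypothetical helper: count the rows whose second column matches a family.
sub count_family {
    my ($handle, $family) = @_;

    # Rewind first. Without this, a second pass starts at end-of-file
    # and the while loop body never runs.
    seek($handle, 0, 0) or die "Can't rewind: $!";

    my $count = 0;
    while (my $line = <$handle>) {
        chomp $line;
        my @data = split /\t/, $line;
        $count++ if $data[1] eq "ABF09-$family";
    }
    return $count;
}

for my $family ('AS5', 'AS9') {
    print "$family: ", count_family($fh, $family), "\n";
}
```

The alternative is to read the file once and sort the rows into per-family
buckets as you go, which avoids the second pass altogether.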

On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
>
> Hey Guys,
>
> I am still at the same place. I am writing these little pieces of code to 
> try to learn the language better, so any advice would be useful. I am again 
> parsing through tab delimited files and now trying to find fish from on id 
> (in these case families AS5 and AS9), retrieve the weights and average 
> them. When I started I did it for one family and it worked (instead of the 
> @families I had a scalar $family set to AS5). But really it is more useful 
> to look at more than one family at time (I should mention that are 2 types 
> of fish per family one ends in PS , the other doesn't). So I tried to use a 
> foreach loop to go through the file twice, once with a the search value set 
> to AS5 and a second time to AS9. It works for AS5, but for some reason, the 
> foreach loop sets $test to AS9 the second time, but it doesn't go through 
> the while loop. What am I doing wrong? 
>
> here is the code:
>
> #! /usr/bin/perl
> use strict;
> use warnings;
>
> my $file = $ARGV[0];
> my @family = ('AS5','AS9');
> my $i;
> my $ii;
> my $test;
>
> open (my $fh, "<", $file) or die ("Can't open $file: $!");
>
> foreach (@family){
>     $test = $_;
>     my @data_weight_2N = ();
>     my @data_weight_3N = ();
>     while (<$fh>){
>         chomp;  
>         my $line = $_;
>         my @data  = split ("\t", $line);
>         if ($data[0] !~ /[0-9]*/){
>         next;}
>         elsif ($data[1] eq "ABF09-$test"){
>             $i += 1; 
>             push (@data_weight_2N,  $data[6]);
>         }elsif ($data[1] eq "ABF09-".$test."PS"){
>         $ii += 1;
>             push (@data_weight_3N,$data[6]);
>     }
> }
>     my $mean_2N = &average (\@data_weight_2N);
>     my $stdev_2N = &stdev (\@data_weight_2N);
>     my $stderr_2N = ($stdev_2N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test 
> 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
>
>     my $mean_3N = &average (\@data_weight_3N);
>     my $stdev_3N = &stdev (\@data_weight_3N);
>     my $stderr_3N = ($stdev_3N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test 
> 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
> }
>
> close ($fh);
>
> sub average{
>         my($data) = @_;
>         if (not @$data) {
>                 print ("Empty array\n");
>                 return 0;
>         }
>         my $total = 0;
>         foreach (@$data) {
>                 $total += $_;
>         }
>         my $average = $total / @$data;
>         return $average;
> }
>
> sub stdev{
>         my($data) = @_;
>         if(@$data == 1){
>                 return 0;
>         }
>         my $average = &average($data);
>         my $sqtotal = 0;
>         foreach(@$data) {
>                 $sqtotal += ($average-$_) ** 2;
>         }
>         my $std = ($sqtotal / (@$data-1)) ** 0.5;
>         return $std;
> }
>
> Thanks,
>
> T.
>
> -- 
> "Education is not to be used to promote obscurantism." - Theodonius 
> Dobzhansky.
>
> "Gracias a la vida que me ha dado tanto
> Me ha dado el sonido y el abecedario
> Con él, las palabras que pienso y declaro
> Madre, amigo, hermano
> Y luz alumbrando la ruta del alma del que estoy amando
>
> Gracias a la vida que me ha dado tanto
> Me ha dado la marcha de mis pies cansados
> Con ellos anduve ciudades y charcos
> Playas y desiertos, montañas y llanos
> Y la casa tuya, tu calle y tu patio"
>
> Violeta Parra - Gracias a la Vida
>
> Tiago S. F. Hori. PhD.
> Ocean Science Center-Memorial University of Newfoundland 
>


From sebastien.moretti at unil.ch  Thu Feb 14 03:09:06 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=)
Date: Thu, 14 Feb 2013 09:09:06 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
	<511BBD83.2000708@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>
Message-ID: <511C9BA2.9000508@unil.ch>

>>>>>> # Add annotation
>>>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>>>                                 -xml => '<name>SUMF family</name>',
>>>>>>                                );
>>>>>
>>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>>>
>>>>> 	-hilmar
>>>>
>>>> I replaced $treeio by $tree in the above line but still get an error.
>>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
>>>>
>>>> The only thing I changed is the length of the xml string I try to insert. But I get the same error with an empty xml string.
>>>>
>>>>
>>>>
>>>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>>>                              -format => 'phyloxml',
>>>>                             );
>>>> my $tree = $treeio->next_tree;
>>>>
>>>> # Add annotation
>>>> $tree->add_phyloXML_annotation(-obj => $tree,
>>>>                                -xml => '<name>SUMF family</name>',
>>>>                               );
>>>>
>>>> Can't locate object method "add_phyloXML_annotation" via package
>>>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>>>     (F) You called a method correctly, and it correctly indicated a package
>>>>     functioning as a class, but that package doesn't define that particular
>>>>     method, nor does any of its base classes.  See perlobj.
>>>>
>>>> Uncaught exception from user code:
>>>> 	
>>>> at ./add_annotation_to_phyloxml.pl line 40
>>>
>>> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>>>
>>> chris
>>
>> You mean that BioPerl 1.6.901 does not fully support PhyloXML?
>> Is the problem I have "expected"?
>>
>> --
>> Sébastien Moretti
>
> I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky.  I tried cleaning this up a few years back but didn't make much progress.
>
> The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it):
>
>      $treeio->add_phyloXML_annotation(-obj => $tree,
>                                -xml => '<name>SUMF family</name>',
>                               );
>
> My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back.  Can you file a bug on this?
>
> https://redmine.open-bio.org/
>
> chris

I will file a bug on this.

I'd be happy to try to contribute to the phyloxml code,
but I don't know how to proceed for BioPerl.

-- 
Sébastien Moretti


From hartzell at alerce.com  Thu Feb 14 15:04:44 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 14 Feb 2013 12:04:44 -0800
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
References: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
	<3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
Message-ID: <20765.17244.185833.755900@gargle.gargle.HOWL>


I think that it's important to get feedback on code that one has
written and to try to understand what someone else has done in their
code, and how and why.  To that end....

Since Tiago's using this to learn the language better I can't resist
some comments beyond resetting the file handle.

For grins I rewrote it using Text::CSV_XS and Statistics::Basic and to
take a single pass through the data file using a multilevel data
structure.

I resisted the urge to rewrite it in Moose.  Didn't even have an urge
to rewrite it in R.  Funny, that....

The script is here

  Tiago.pl
    https://gist.github.com/hartzell/4955401

With something like what I think the data looks like here:

    https://gist.github.com/hartzell/4955570

Even without that big of a rewrite, I had a bunch of local comments
which are inline below.

Daisie Huang writes:
 > [...]
 > On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
 > >
 > > Hey Guys,
 > >
 > > I am still at the same place. I am writing these little pieces of code to 
 > > try to learn the language better, so any advice would be useful.
 > > [...]
 > > here is the code:
 > >
 > > #! /usr/bin/perl
 > > use strict;
 > > use warnings;
 > >
 > > my $file = $ARGV[0];

Slightly better would be $filename, so that when you step up to
Path::Class you can differentiate a file object from a file name
string.

 > > my @family = ('AS5','AS9');

Better would be @families, plural.  See the use of $family below.

 > > my $i;
 > > my $ii;

As far as I can tell, these are just counting the number of things
that you push onto the various arrays.  You don't need them, referring
to the list in scalar context will give you its size.
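
A small sketch of what that looks like, with made-up weights (the numbers
are placeholders, not real data):

```perl
use strict;
use warnings;

my @data_weight_2N = (12.5, 14.0, 13.2);    # made-up weights

my $n      = @data_weight_2N;               # array in scalar context: its size
my $also_n = scalar @data_weight_2N;        # the same thing, spelled out

print "n = $n, also_n = $also_n\n";

# sqrt is a unary operator, so it puts its argument in scalar context
# and sees the element count rather than the elements themselves.
print "sqrt(n) = ", sqrt(@data_weight_2N), "\n";
```

Because the count always comes from the array itself, it cannot drift out of
step with the data the way a hand-maintained counter can.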

 > > my $test;

You use this to hold the name of the family, so it's not particularly
evocative.  You should also restrict its scope to within the loop.
See the comment for the foreach loop.

 > > open (my $fh, "<", $file) or die ("Can't open $file: $!");

You made my day, three arg. open *and* you checked for errors.  Nice!

 > > foreach (@family){

Better would be

  for my $family (@families) {

which is evocative and restricts the scope of $family to the for loop
(and for is 4 characters shorter than foreach...).

 > >     $test = $_;

No longer need this, using $family declared in the for loop with the
proper scoping.

 > >     my @data_weight_2N = ();
 > >     my @data_weight_3N = ();
 > >     while (<$fh>){
 > >         chomp;  
 > >         my $line = $_;
 > >         my @data  = split ("\t", $line);

Don't parse CSV (TSV) files yourself.  Get in the habit of using
Text::CSV_XS.

 > >         if ($data[0] !~ /[0-9]*/){
 > >         next;}
 > >         elsif ($data[1] eq "ABF09-$test"){
 > >             $i += 1; 

You don't need the counter.

 > >             push (@data_weight_2N,  $data[6]);
 > >         }elsif ($data[1] eq "ABF09-".$test."PS"){
 > >         $ii += 1;

You don't need the counter.

 > >             push (@data_weight_3N,$data[6]);
 > >     }
 > > }
 > >     my $mean_2N = &average (\@data_weight_2N);
 > >     my $stdev_2N = &stdev (\@data_weight_2N);

You don't need the ampersands on the subroutine calls.  They're old
school <joke> and just encourage people to make fun of our language for its
use of all those funny punctuation marks </joke>.

 > >     my $stderr_2N = ($stdev_2N/sqrt($i));

Unless I'm mistaken, this is equivalent

    my $stderr_2N = ($stdev_2N/sqrt(scalar @data_weight_2N));

and you don't need the counter, the explicit use of scalar there might
even be redundant (I'm a coward).  You use the same trick in your
subroutine defn's below.

 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test 
 > > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
 > >
 > >     my $mean_3N = &average (\@data_weight_3N);
 > >     my $stdev_3N = &stdev (\@data_weight_3N);
 > >     my $stderr_3N = ($stdev_3N/sqrt($i));
 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test 
 > > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
 > > }
 > >
 > > close ($fh);

Ah, rats.  You checked whether open worked, you need to do the same
thing on close too!

  close ($fh) or die ("Can't close $file: $!");

Or you could just

  use autodie qw(open close);

and then they'll die appropriately when they have to and you don't
have to bother with the checking.
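
A minimal sketch of the autodie version, assuming an in-memory "file" and a
deliberately bogus path so that it runs anywhere (both are placeholders):

```perl
use strict;
use warnings;
use autodie qw(open close);

# In-memory contents standing in for a real data file.
my $contents = "line one\nline two\n";

open(my $fh, '<', \$contents);    # no "or die" needed, autodie checks it
my @lines = <$fh>;
close($fh);                       # also checked
print scalar(@lines), " lines read\n";

# A failing open now throws an exception instead of returning false.
# The path below is assumed not to exist.
my $ok = eval { open(my $bad, '<', '/no/such/dir/no_such_file.tsv'); 1 };
print "open failed and threw an exception\n" unless $ok;
```

The eval is only there to show the failure without killing the sketch; in a
normal script you would let the exception end the program.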

 > > sub average{
 > >         my($data) = @_;
 > >         if (not @$data) {
 > >                 print ("Empty array\n");
 > >                 return 0;
 > >         }
 > >         my $total = 0;
 > >         foreach (@$data) {
 > >                 $total += $_;
 > >         }

  use List::AllUtils qw(sum); # somewhere up at the top of the script...

  my $total = sum(@$data);
  if (not defined $total) {
     print "Empty array\n";
     return;
  }

List::AllUtils is your friend.  Learn to use it.

Returning 0 for an empty list is probably the wrong thing: isn't it
possible for the total to actually be 0?  Just return instead.
Don't return undef, just return (and let perl take context into
account for you).

You probably don't actually want to spew "Empty array" out into your
output stream, imagine writing a script that postprocesses your output
and having to deal with it.  If you really need to say it, send it to
standard error with

  print STDERR "Empty array\n";

 > >         my $average = $total / @$data;
 > >         return $average;

If you don't really need the error message, then you can get to

  my $total = sum(@$data);
  return unless defined $total;
  return $total / @$data;

And if an empty data array is *truly* unexpected, maybe you should
just die/carp.
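
To make the difference between "no data" and "a total of zero" concrete,
here is a small sketch.  It uses sum from the core List::Util module rather
than List::AllUtils so that it runs without installing anything; that swap,
and the sample inputs, are assumptions for illustration:

```perl
use strict;
use warnings;
use List::Util qw(sum);    # core module; returns undef for an empty list

sub average {
    my ($data) = @_;
    my $total = sum(@$data);
    return unless defined $total;    # empty list: nothing to average
    return $total / @$data;          # a total of 0 is still a valid answer
}

for my $case ([0, 0, 0], [2, 4], []) {
    my $mean = average($case);
    print defined $mean ? "mean: $mean\n" : "no data\n";
}
```

Testing with defined rather than plain truth is what keeps a legitimate
mean of 0 from being mistaken for a missing result.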

 > > }
 > >
 > > sub stdev{
 > >         my($data) = @_;
 > >         if(@$data == 1){
 > >                 return 0;
 > >         }
 > >         my $average = &average($data);
 > >         my $sqtotal = 0;
 > >         foreach(@$data) {
 > >                 $sqtotal += ($average-$_) ** 2;
 > >         }
 > >         my $std = ($sqtotal / (@$data-1)) ** 0.5;
 > >         return $std;
 > > }

Ditto on the use of List::AllUtils, etc...

Phew.

The only other thing I'd like to see would be an arrangement that
lets you write simple tests.  A simple sol'n would be to package the
entire main part of the code up into e.g. a subroutine that returns a
hashref keyed by family, containing a hashref keyed by 2N/3N/... and
then you could just:

  use Test::More;
  
  use Tiago qw(summarize);
  
  my $output = summarize("test_data.tsv");
  
  is($output->{AS5}->{'2N'}, "42", "Got the magic number");
  
  # etc...
  
  done_testing;
  
Thanks for sharing your code.  Keep practicing!

g.


From carandraug+dev at gmail.com  Thu Feb 14 17:13:45 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Thu, 14 Feb 2013 22:13:45 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
Message-ID: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>

Hi

we got word of it on another project I'm involved with and I was
wondering. Is bioperl going to apply for the Google Summer of Code
this year?

http://www.google-melange.com/gsoc/homepage/google/gsoc2013

Carnë


From hlapp at drycafe.net  Fri Feb 15 09:28:30 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 15 Feb 2013 09:28:30 -0500
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
Message-ID: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>

I presume the OBF does, as an umbrella organization on behalf of all Bio* projects. If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or to look for co-mentors.

-hilmar

Sent with a tap.

On Feb 14, 2013, at 5:13 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> we got word of it on another project I'm involved with and I was
> wondering. Is bioperl going to apply for the Google Summer of Code
> this year?
> 
> http://www.google-melange.com/gsoc/homepage/google/gsoc2013
> 
> Carnë
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From p.j.a.cock at googlemail.com  Fri Feb 15 09:47:39 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 15 Feb 2013 14:47:39 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
Message-ID: <CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>

On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:
> I presume the OBF does, as an umbrella organization on behalf of all Bio*
> projects. If you fancy proposing a project idea or mentoring, now is not a
> bad time to think about that or to look for co-mentors.
>
> -hilmar

Yes, the plan is that as in the last few years, the OBF will apply to
GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
this stage the Bio* projects would be wise to start coming up with
some good project ideas and experienced developers thinking about
being a mentor. For potential students, getting involved in the
community early is a good idea (e.g. bug reports, or better fixing
existing bugs)

See also:
http://lists.open-bio.org/mailman/listinfo/gsoc
http://lists.open-bio.org/mailman/listinfo/gsoc-mentors

Peter


From cjfields at illinois.edu  Fri Feb 15 09:59:43 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 15 Feb 2013 14:59:43 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
	<CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>

On Feb 15, 2013, at 8:47 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:
>> I presume the OBF does, as an umbrella organization on behalf of all Bio*
>> projects. If you fancy proposing a project idea or mentoring, now is not a
>> bad time to think about that or to look for co-mentors.
>> 
>> -hilmar
> 
> Yes, the plan is that as in the last few years, the OBF will apply to
> GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
> this stage the Bio* projects would be wise to start coming up with
> some good project ideas and experienced developers thinking about
> being a mentor. For potential students, getting involved in the
> community early is a good idea (e.g. bug reports, or better fixing
> existing bugs)
> 
> See also:
> http://lists.open-bio.org/mailman/listinfo/gsoc
> http://lists.open-bio.org/mailman/listinfo/gsoc-mentors
> 
> Peter

At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else.  I can't take charge of writing up a proposal at the moment but I can certainly help edit.

chris


From scott at scottcain.net  Fri Feb 15 14:18:37 2013
From: scott at scottcain.net (Scott Cain)
Date: Fri, 15 Feb 2013 14:18:37 -0500
Subject: [Bioperl-l] sequence-region directives in gff files
In-Reply-To: <CAPOrs_3r_cay3d59uBXCNqKwGHRBOBy+c+XOzvrfMeHdbzNTLg@mail.gmail.com>
References: <CAPOrs_3r_cay3d59uBXCNqKwGHRBOBy+c+XOzvrfMeHdbzNTLg@mail.gmail.com>
Message-ID: <CA+JTaox4SeQueWRpvgmq7GpdJ=EzQe6t3Lim2yn6y=_dBcp95A@mail.gmail.com>

Hi Carnë,

Thanks for pointing this out; I was only sort of paying attention to
the FeatureIO discussion, and it hadn't occurred to me that my commit
was the problem.

I believe I've reproduced the functionality from that commit, and I
even added a test that makes use of the added method (yes, I know, it
surprised me too!).  All of the tests now pass for me in the FeatureIO
master.  I'm putting it on my todo list to check that the Chado loader
that makes use of Bio::FeatureIO still works as expected with the new
incarnation.

Thanks,
Scott


On Wed, Feb 13, 2013 at 5:22 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
> Hi Scott
>
> 3 years ago, the code for the Bio::SeqFeatureIO::* modules was split
> from bioperl-live into a separate repository[1]. Because the code was
> not removed from the bioperl-live repository, people ended up patching
> on both sides, leading to 2 branches of development. Last weekend I
> merged them back together with the exception of one commit that would
> no longer apply[2].
>
> This commit was authored by you with the following commit message:
> "tiny change to Bio::FeatureIO::gff to allow the gmod chado gff3 bulk
> loader to not choke when the gff file has ##sequence-region
> directives.  The loader is documented not to support this, but now it
> will quitely ignore those directives."
>
> Do you think you could take a look at it?
>
> Thank you,
> Carnë
>
> [1] https://github.com/bioperl/Bio-FeatureIO
> [2] https://github.com/bioperl/bioperl-live/commit/7218728b66ad297953676236077fd0ec757378c0


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From carandraug+dev at gmail.com  Tue Feb 19 13:52:57 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Tue, 19 Feb 2013 18:52:57 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAPOrs_0u2Qpft6_pWMaj3Wdf_-ZPOfnoYoOaevdCL443hnUsoA@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
	<CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0u2Qpft6_pWMaj3Wdf_-ZPOfnoYoOaevdCL443hnUsoA@mail.gmail.com>
Message-ID: <CAPOrs_0kiyqSfvS7ZgEkWwbAaiA2L5fV9U2r5U9cROTvyMGLRw@mail.gmail.com>

On 15 February 2013 14:28, Hilmar Lapp <hlapp at drycafe.net> wrote:
> [...]
> If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors.

On 15 February 2013 14:59, Fields, Christopher J <cjfields at illinois.edu> wrote:
> At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else.  I can't take charge of writing up a proposal at the moment but I can certainly help edit.

I would like to participate this year as a student.

I do not, however, have any bioperl itch that would take a summer
to fix. The largest of them is to implement BLAST using NCBI's server.
They have made a SOAP-based BLAST available, and doing this has been on
my todo list for ages. Would you suggest any other project for bioperl?

Carnë


From peymanalavi at yahoo.com  Tue Feb 19 16:16:49 2013
From: peymanalavi at yahoo.com (peyman alavi)
Date: Tue, 19 Feb 2013 13:16:49 -0800 (PST)
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
Message-ID: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>

Hello,
I have been having problems for a while trying to install the Bio::SCF
module on my Vista32. Now, I know that Bio::SCF isn't really a Bioperl
module, but I need it for Bio::Graphics, and I thought perhaps other
people had experienced the same problem before. I have installed zlib
and io_lib (the latest available versions of both), but it looks like
something (presumably in io_lib) is missing. I would be very grateful
if someone could tell me what still needs to be done!

Here are the paths where the io_lib "library" and "include" directories
are installed; I set them in cpan before trying to install Bio::SCF:

o conf makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"

And the following is what I get on STDOUT:

Set up gcc environment - 4.7.2

cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.

    makepl_arg         [LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include]
Please use 'o conf commit' to make the config permanent!

Reading 'D:\Perl\cpan\Metadata'
  Database was generated on Sun, 17 Feb 2013 12:17:02 GMT
Running install for module 'Bio::SCF'
Running make for L/LD/LDS/Bio-SCF-1.03.tar.gz
Checksum for D:\Perl\cpan\sources\authors\id\L\LD\LDS\Bio-SCF-1.03.tar.gz ok
Scanning cache D:\Perl/cpan/build for sizes
............................................................................DONE
Bio-SCF-1.03/
Bio-SCF-1.03/t/
Bio-SCF-1.03/t/scf.t
Bio-SCF-1.03/eg/
Bio-SCF-1.03/eg/write_test_obj.pl
Bio-SCF-1.03/eg/write_test_tied.pl
Bio-SCF-1.03/eg/read_test_obj.pl
Bio-SCF-1.03/eg/read_test_tied.pl
Bio-SCF-1.03/SCF/
Bio-SCF-1.03/SCF/Arrays.pm
Bio-SCF-1.03/DISCLAIMER
Bio-SCF-1.03/README
Bio-SCF-1.03/SCF.pm
Bio-SCF-1.03/SCF.xs
Bio-SCF-1.03/Changes
Bio-SCF-1.03/test.scf
Bio-SCF-1.03/Makefile.PL
Bio-SCF-1.03/META.yml
Bio-SCF-1.03/INSTALL
Bio-SCF-1.03/MANIFEST

  CPAN.pm: Building L/LD/LDS/Bio-SCF-1.03.tar.gz

Set up gcc environment - 4.7.2
Checking if your kit is complete...
Looks good
Writing Makefile for Bio::SCF
Writing MYMETA.yml and MYMETA.json
cp SCF.pm blib\lib\Bio\SCF.pm
cp SCF/Arrays.pm blib\lib\Bio\SCF\Arrays.pm
D:\Perl\bin\perl.exe D:\Perl\site\lib\ExtUtils\xsubpp  -typemap D:\Perl\lib\ExtUtils\typemap  SCF.xs > SCF.xsc &&
D:\Perl\bin\perl.exe -MExtUtils::Command -e mv -- SCF.xsc SCF.c
Please specify prototyping behavior for SCF.xs (see perlxs manual)
c:/MinGW/bin/gcc.exe -c  -Ic:/MinGW/msys/1.0/local/include -DNDEBUG
-DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE
-DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T
-DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields -O2 -DVERSION=\"1.03\" -DXS_VERSION=\"1.03\" "-ID:\Perl\lib\CORE" -DLITTLE_ENDIAN SCF.c
In file included from c:/MinGW/msys/1.0/local/include/io_lib/scf.h:31:0,
                 from SCF.xs:12:
c:/MinGW/msys/1.0/local/include/io_lib/mFILE.h:23:0: warning: "MF_APPEND" redefined [enabled by default]
In file included from c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/windows.h:55:0,
                 from D:\Perl\lib\CORE/win32.h:61,
                 from D:\Perl\lib\CORE/win32thread.h:4,
                 from D:\Perl\lib\CORE/perl.h:2825,
                 from SCF.xs:5:
c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/winuser.h:131:0:
note: this is the location of the previous definition
SCF.xs: In function 'XS_Bio__SCF_get_scf_pointer':
SCF.xs:35:2: warning: passing argument 3 of '(*Perl_ILIO_ptr((struct
PerlInterpreter *)Perl_get_context()))->pNameStat' from incompatible pointer
type [enabled by default]
SCF.xs:35:2: note: expected 'struct _stati64 *' but argument is of type
'struct stat *'
Running Mkbootstrap for Bio::SCF ()
D:\Perl\bin\perl.exe -MExtUtils::Command -e chmod -- 644 SCF.bs
D:\Perl\bin\perl.exe -MExtUtils::Mksymlists \
     -e "Mksymlists('NAME'=>\"Bio::SCF\", 'DLBASE' => 'SCF', 'DL_FUNCS' => {  }, 'FUNCLIST' => [], 'IMPORTS' => {  }, 'DL_VARS' => []);"
Set up gcc environment - 4.7.2
dlltool --def SCF.def --output-exp dll.exp
c:\MinGW\bin\g++.exe -o blib\arch\auto\Bio\SCF\SCF.dll -Wl,--base-file
-Wl,dll.base -mdll -L"D:\Perl\lib\CORE" SCF.o D:\Perl\lib\CORE\perl512.lib
c:\MinGW\lib\libkernel32.a c:\MinGW\lib\libuser32.a c:\MinGW\lib\libgdi32.a
c:\MinGW\lib\libwinspool.a c:\MinGW\lib\libcomdlg32.a c:\MinGW\lib\libadvapi32.a
c:\MinGW\lib\libshell32.a c:\MinGW\lib\libole32.a c:\MinGW\lib\liboleaut32.a
c:\MinGW\lib\libnetapi32.a c:\MinGW\lib\libuuid.a c:\MinGW\lib\libws2_32.a
c:\MinGW\lib\libmpr.a c:\MinGW\lib\libwinmm.a c:\MinGW\lib\libversion.a
c:\MinGW\lib\libodbc32.a c:\MinGW\lib\libodbccp32.a c:\MinGW\lib\libcomctl32.a
c:\MinGW\lib\libmsvcrt.a dll.exp
Warning: resolving _VirtualQuery at 12 by linking to _VirtualQuery
Use --enable-stdcall-fixup to disable these warnings
Use --disable-stdcall-fixup to disable these fixups
Warning: resolving _VirtualProtect at 16 by linking to _VirtualProtect
Warning: resolving _EnterCriticalSection at 4 by linking to
_EnterCriticalSection
Warning: resolving _TlsGetValue at 4 by linking to _TlsGetValue
Warning: resolving _GetLastError at 0 by linking to _GetLastError
Warning: resolving _LeaveCriticalSection at 4 by linking to
_LeaveCriticalSection
Warning: resolving _DeleteCriticalSection at 4 by linking to
_DeleteCriticalSection
Warning: resolving _InitializeCriticalSection at 4 by linking to
_InitializeCriticalSection
SCF.o:SCF.c:(.text+0xf35): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0xf4b): undefined reference to `mfwrite_scf'
SCF.o:SCF.c:(.text+0xf6a): undefined reference to `mfflush'
SCF.o:SCF.c:(.text+0xf72): undefined reference to `mfdestroy'
SCF.o:SCF.c:(.text+0x1138): undefined reference to `write_scf'
SCF.o:SCF.c:(.text+0x16ac): undefined reference to `scf_deallocate'
SCF.o:SCF.c:(.text+0x17b1): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0x17c1): undefined reference to `mfread_scf'
SCF.o:SCF.c:(.text+0x19bd): undefined reference to `read_scf'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe:
SCF.o: bad reloc address 0xa4 in section `.rdata'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe:
final link failed: Invalid operation
collect2.exe: error: ld returned 1 exit status
dmake.exe:  Error code 129, while making 'blib\arch\auto\Bio\SCF\SCF.dll'
  LDS/Bio-SCF-1.03.tar.gz
  D:\Perl\site\bin\dmake.exe -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 LDS/Bio-SCF-1.03.tar.gz                      : make NO

Warning: Configuration not saved.
Lockfile removed.

Thanks in advance for any useful suggestions/help!!
Peyman


From scott at scottcain.net  Tue Feb 19 18:39:44 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 19 Feb 2013 18:39:44 -0500
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan
	fails
In-Reply-To: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
References: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
Message-ID: <777246AB-2EF0-403D-9652-8EA8390D5C53@scottcain.net>

Hi Peyman,

I have no idea what might be required to get staden and Bio::SCF installed on a Windows machine; you have my sympathies for having to go through it.

But what I wanted to touch on was what you wrote, namely that you "need" it for Bio::Graphics. You don't need it unless you want to be able to display traces from ABI sequencers (which most people don't really care to do these days). Bio::SCF is listed as a recommended module, not a required one.

Scott



On Feb 19, 2013, at 4:16 PM, peyman alavi <peymanalavi at yahoo.com> wrote:

> Hello,
> I am having
> problems for a while trying to install the Bio::SCF module on my Vista32. Now, I know that Bio::SCF isn't really a Bioperl module, but I need it for Bio::Graphics, and I thought perhaps other people had experienced the same problem before.  I
> have installed zlib and io_lib (both their last available versions), but it
> looks like sth. (presumably with io_lib) is missing. I should be very grateful
> if someone could tell me what still needs to be done!
> Here are
> the paths where the io_lib "library" and "include" directories are installed, and I
> set them to cpan before trying to install Bio::SCF:
> o conf
> makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"
> And the
> following is what I get on the STDOUT:
> [...]
>  Thanks in advance for any useful
> suggestions/help!!
> Peyman
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From anngregory at email.arizona.edu  Wed Feb 20 00:20:41 2013
From: anngregory at email.arizona.edu (Ann Gregory)
Date: Tue, 19 Feb 2013 22:20:41 -0700
Subject: [Bioperl-l]  Problem Parsing BLAST output to annotate FASTA file
Message-ID: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>

Hi BioPerl,

I am having issues with a BioPerl script. I have a blastxml file from a
blastx search and the original multifasta file containing the original
nucleotide sequences.

I want to take the blast result (i.e. the blast description) and annotate my
multifasta file.

I have written 2 while loops that extract the blast descriptions as well as
the nucleotide sequences from the multifasta file.

My problem is that I cannot incorporate one of the while loops into the
other without losing the loop property of one of the loops. I would like
to take the 1st blast description, then the 1st nucleotide sequence, then
the 2nd blast description, then the 2nd nucleotide sequence and so
on... I just can't figure out how to alternate the results.

See script below:


use warnings;
use strict;
use Bio::SearchIO;
use Bio::SeqIO;


my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            my $qd = $hit->description;
            print $qd, "\n";
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

--
Ann (Nina) Gregory
Graduate Student
Rich Lab / Sullivan Lab
Soil, Water, Environmental Science Department
University of Arizona


From yonexhalaolv at gmail.com  Wed Feb 20 04:17:12 2013
From: yonexhalaolv at gmail.com (Sebastian Lau)
Date: Wed, 20 Feb 2013 01:17:12 -0800 (PST)
Subject: [Bioperl-l] =?utf-8?q?failed_to_install_via_fink=EF=BC=9Ano_packa?=
 =?utf-8?q?ge_found_for_specification_=27bioperl-pm5100=27!?=
Message-ID: <84fa1bcb-a39f-4847-bff2-e3a9c2b909ea@googlegroups.com>

Hi guys,

I was just about to install bioperl on my Mac OS X 10.7.5 via fink, but after
typing the command, fink said it couldn't find any package:

fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm5100
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm5100'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm588
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm588'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm586
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm586'!

I followed the instructions on the wiki. I don't know what's wrong with it.
Thanks for your help.


From awitney at sgul.ac.uk  Wed Feb 20 10:22:51 2013
From: awitney at sgul.ac.uk (Adam Witney)
Date: Wed, 20 Feb 2013 15:22:51 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124EA4B.5020409@sgul.ac.uk>


Hi Ann,

On 20/02/2013 05:20, Ann Gregory wrote:
> Hi BioPerl,
> 
> I am having issues with a BioPerl script. I have a blastxml file from a
> blastx blast and the original multifasta file containing the original
> nucleotides sequences.
> 
> I want to take the blast result (ie. the blast description) and annotate my
> multifasta file.
> 
> I have written 2 while loops that extract the blast descriptions as well as
> the nucleotide sequence from the multifasta file.
> 
> My problem is that I cannot incorporate one of the while loops into the
> other without loosing the loop property of one of the loops. I would like
> to take the 1st blast description, then the 1st nucleotide sequence, then
> the 2nd blast description, then the 2nd nucleotide sequence and so
> on...just can figure out how to alternate the results.
> 
> See script below:
> 
> 
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> 
> 
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $qd = $hit->description;
> print $qd, "\n";
> }
> }
> }
> 
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> print $nuc, "\n";
> }--

I think what you are proposing assumes that the loop over the BLAST
results will come back in the same order as the loop over the FASTA
file. This may be the case, but I'm not sure it's something I would rely on.

Anyway, I would loop over the BLAST results, storing the relevant data
in an array or hash, and then loop over the FASTA file to put the two
together, e.g.:

my $blast_data;

while ( ... blast data ... ) {
	...
	$blast_data->{$qd} = <whatever you want to store>
	...
}

while ( my $seqobj = $seqio->next_seq ) {
	my $id = $seqobj->id;
	print $blast_data->{$id}."\n";
}

Something along those lines... or have I misunderstood you? If so, can
you provide some more details, such as what you want your output to look
like?
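
A minimal sketch of the hash idea, in case it helps. To keep it
self-contained it uses plain Perl data instead of BioPerl objects, so
annotate_records, the contig names and the descriptions are all made up
for illustration. In a real script the hash would presumably be filled
inside the Bio::SearchIO loop and the records would come from Bio::SeqIO.

```perl
use strict;
use warnings;

# Hypothetical helper: build FASTA text in which each header carries the
# stored BLAST description; sequences without a hit are skipped.
#   $best_hit : hash ref, query id => description of its first hit
#   $records  : array ref of [ id, sequence ] pairs, in FASTA file order
sub annotate_records {
    my ($best_hit, $records) = @_;
    my $out = '';
    for my $rec (@$records) {
        my ($id, $seq) = @$rec;
        next unless exists $best_hit->{$id};   # no BLAST hit for this query
        $out .= ">$id $best_hit->{$id}\n$seq\n";
    }
    return $out;
}

# Made-up example data. Note that the hash is not in file order;
# the lookup by id is what pairs each description with its sequence.
my %best_hit = (
    contig2 => 'hypothetical protein',
    contig1 => 'DNA polymerase III subunit alpha',
);
my @records = (
    [ contig1 => 'ATGGCGTTA' ],
    [ contig2 => 'ATGAAACCC' ],
    [ contig3 => 'ATGTTTGGG' ],   # no hit, so it is dropped
);

print annotate_records(\%best_hit, \@records);
```

Because the pairing is done by key, it no longer matters whether the
BLAST report and the FASTA file list the queries in the same order.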

HTH

Adam


From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 11:24:50 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:24:50 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124F8D2.4020904@uni-wuerzburg.de>

Oops, I just realized I had one loop too many in there. Adam is correct.
Sorry.

The last part of the code I sent you should look like this:

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    print ">$hits{$seqobj->display_id}\n";
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}


Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 06:20, Ann Gregory wrote:
> [...]


From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 11:14:29 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:14:29 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124F665.5050602@uni-wuerzburg.de>

Hi Ann,

I agree with Adam, but I was already writing my email, while his came 
in. Hope it helps:

I hope I understand correctly what you want to do.
Just to clarify, you queried a protein blast database with blastx and 
nucleotide queries. Now you want to associate the protein description 
for the FIRST blast hit with the corresponding nucleotide fasta file. Is 
that correct?
You have to put the two while loops into one another. Or associate the 
blast hits with the query descriptions. But it's not feasible to take 
the first blast hit and the first nucleotide fasta seq, then the 2nd of 
both etc, as Adam already pointed out.
You would have to iterate through both at the same time. I.e. take the 
first blast hit, then iterate through the nucleotide fasta until you 
find the hit. Then take the 2nd blast hit and iterate through the 
nucleotide fasta etc. It's probably easiest to do this in a hash.

Something along the lines of (not tested, I just typed this into the e-mail):

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            if ($hit->description eq $hit_desc) { # Only want the first blast hit
                next;
            }
            my $hit_desc = $hit->description;
            $hits{$result->query_description} = $hit_desc;
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
foreach my $query (keys %hits) {
    while (my $seqobj = $seqio->next_seq) {
        if ($seqobj->display_id eq $query) {
            print ">$hits{$query}\n";
            my $nuc = $seqobj->seq();
            print $nuc, "\n";
        }
    }
}

You might want to put some evalue cutoff in there to only score 
significant hits. Also if your nucleotide query multi-fasta file is very 
large, you might consider creating an index first:
http://www.bioperl.org/wiki/HOWTO:Local_Databases#Bio::Index
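
A rough sketch of what such a cutoff could look like, using plain Perl
hashes in place of the hit objects so that it runs on its own. The helper
first_significant_hit, the example descriptions and the 1e-5 threshold are
assumptions for illustration only; with real BioPerl hit objects the
e-value should be available via $hit->significance, as far as I know.

```perl
use strict;
use warnings;

# Hypothetical helper: return the description of the first hit whose
# e-value is at or below $cutoff, or undef if no hit qualifies.
#   $hits : array ref of { description => ..., evalue => ... } hashes,
#           in the order BLAST reported them (best hit first)
sub first_significant_hit {
    my ($hits, $cutoff) = @_;
    for my $hit (@$hits) {
        return $hit->{description} if $hit->{evalue} <= $cutoff;
    }
    return;   # undef when assigned to a scalar
}

# Made-up hit lists for two queries.
my @strong = (
    { description => 'ABC transporter',      evalue => 3e-40 },
    { description => 'putative transposase', evalue => 0.8   },
);
my @weak = (
    { description => 'hypothetical protein', evalue => 4.2 },
);

for my $hits (\@strong, \@weak) {
    my $desc = first_significant_hit($hits, 1e-5);
    print defined $desc ? "$desc\n" : "no significant hit\n";
}
```

A query with only weak hits then ends up without an entry, so it can be
treated the same way as a query with no hits at all.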

Hope that helps!

Cheers,
Andreas

P.S.: Please next time include version numbers for BioPerl and Perl and 
a little more detail what you want to do. ;-)


--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 06:20, Ann Gregory wrote:
> [...]


From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 12:00:51 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 18:00:51 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
	<5124F8D2.4020904@uni-wuerzburg.de>
	<CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
Message-ID: <51250143.9050503@uni-wuerzburg.de>

Hey Ann,

Damn, it's not my best day... Anyway, I wouldn't work with
List::MoreUtils's each_array function, as this assumes that the blast
hits and the nucleotide queries are in the same order (as Adam pointed
out). Rather, use a hash, which associates a key with a certain value. Also,
the hash can be used to skip sequences that have no hits.
Here's my new version:

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            # hash: associate query_desc (key) with hit_desc (value)
            $hits{$result->query_description} = $hit->description;
            last; # jump out of the while loop; this should resolve getting only the first hit
        }
        last; # see above
    }
}


my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    # only true if display_id is associated with a hit_desc, so seqs without hits are skipped
    if ($hits{$seqobj->display_id}) {
        print ">$hits{$seqobj->display_id}\n";
        my $nuc = $seqobj->seq();
        print $nuc, "\n";
    }
}
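
To see why the hash works where a position-by-position pairing does not, here is a minimal sketch in plain Perl with no BioPerl involved. The IDs, descriptions and sequences are made up for illustration; the point is only that the lookup goes by key, so the order of the two inputs and any missing hits do not matter.

```perl
use strict;
use warnings;

# Toy stand-ins for the parsed data (made up for illustration):
# query ID => description of its first BLAST hit.
my %hits = (
    seq1 => 'hypothetical protein A',
    seq3 => 'hypothetical protein C',
);

# FASTA records in file order; seq2 has no hit, so the two data sets
# are not equally long and cannot be paired by position.
my @records = (
    [ seq1 => 'ATGAAA' ],
    [ seq2 => 'ATGCCC' ],
    [ seq3 => 'ATGGGG' ],
);

# Build the annotated FASTA text, skipping records without a hit.
my $out = '';
for my $rec (@records) {
    my ($id, $nuc) = @$rec;
    next unless exists $hits{$id};
    $out .= ">$hits{$id}\n$nuc\n";
}
print $out;
```

Here seq2 is silently dropped, and seq3 still gets its own description even though it is the third record but only the second hit.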

Cheers,
Andreas

P.S.: I redirected your mail to the BioPerl mailing list, others might 
profit from my mistakes ;-) ...

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 17:35, Ann Gregory wrote:
> Hi Andreas,
>
> Thanks for your help! I don't understand how this gets the first blast hit:
>
> if ($hit->description eq $hit_desc) { # Only want the first blast hit
> next;
> }
>
> I tried this and it seems to be working... but I can't get the 1st blast hit
> or skip the sequences that had no hits. Do you know any quick fixes?
>
> *
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> use List::MoreUtils qw(each_array);
>
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> my @ids;
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $match = $result->num_hits;
> push(@ids, $qd);
> }
> }
> }
> }
>
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> my @seqs;
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> push(@seqs, $nuc);
> }
>
> my $it = each_array(@ids, @seqs);
> while(my($ids,$seqs)=$it->()){
> print $ids, "\n", $seqs, "\n";
> }
> *
>
> Thanks again!
> ~Ann
>
> On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach
> <andreas.leimbach at uni-wuerzburg.de
> <mailto:andreas.leimbach at uni-wuerzburg.de>> wrote:
>
>     oops, I just realized I had one loop too much in there. Adam is
>     correct. Sorry.
>
>     The last part of the code I sent you should look like this:
>
>
>     my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>     while (my $seqobj = $seqio->next_seq) {
>     print ">$hits{$seqobj->display_id}\n";
>
>     my $nuc = $seqobj->seq();
>     print $nuc, "\n";
>     }
>
>
>     Cheers,
>     Andreas
>
>
>     --
>     Andreas Leimbach
>     Universität Münster
>     Institut für Hygiene
>     Mendelstr. 7
>     D-48149 Münster
>     Germany
>
>     Tel.: +49 (0)551 39 3843
>     E-Mail: andreas.leimbach at uni-wuerzburg.de
>
>     On 20.2.13 06:20, Ann Gregory wrote:
>
>         Hi BioPerl,
>
>         I am having issues with a BioPerl script. I have a blastxml file from a
>         blastx search and the original multifasta file containing the original
>         nucleotide sequences.
>
>         I want to take the blast result (i.e. the blast description) and annotate
>         my multifasta file.
>
>         I have written 2 while loops that extract the blast descriptions as well
>         as the nucleotide sequences from the multifasta file.
>
>         My problem is that I cannot incorporate one of the while loops into the
>         other without losing the loop property of one of the loops. I would like
>         to take the 1st blast description, then the 1st nucleotide sequence, then
>         the 2nd blast description, then the 2nd nucleotide sequence and so
>         on... I just can't figure out how to alternate the results.
>
>         See script below:
>
>
>         use warnings;
>         use strict;
>         use Bio::SearchIO;
>         use Bio::SeqIO;
>
>
>         my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
>         "$ARGV[0]");
>         while (my $result = $search_in->next_result) {
>         while (my $hit = $result->next_hit) {
>         while (my $hsp = $hit->next_hsp) {
>         my $qd = $hit->description;
>         print $qd, "\n";
>         }
>         }
>         }
>
>         my $seqio = Bio::SeqIO->new(-format => 'fasta', -file =>
>         "$ARGV[1]");
>         while (my $seqobj = $seqio->next_seq) {
>         my $nuc = $seqobj->seq();
>         print $nuc, "\n";
>         }
>         --
>         Ann (Nina) Gregory
>         Graduate Student
>         Rich Lab / Sullivan Lab
>         Soil, Water, Environmental Science Department
>         University of Arizona
>
>
>
>
> --
> Ann (Nina) Gregory
> Graduate Student
> Rich Lab / Sullivan Lab
> Soil, Water, Environmental Science Department
> University of Arizona
>
>
>


From cjfields at illinois.edu  Wed Feb 20 13:24:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 20 Feb 2013 18:24:58 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <51250143.9050503@uni-wuerzburg.de>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
	<5124F8D2.4020904@uni-wuerzburg.de>
	<CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
	<51250143.9050503@uni-wuerzburg.de>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2EB4A@CHIMBX5.ad.uillinois.edu>

If this is meant to be something done using the same FASTA files for a bunch of BLAST reports, might be worth setting up a flat file index and using that to look up and grab the sequences; it should be a LOT faster, just the first pass (generation of the initial index) would take a little time.  Look at Bio::DB::Fasta for an example.

chris



From carandraug+dev at gmail.com  Mon Feb 25 05:08:23 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Mon, 25 Feb 2013 10:08:23 +0000
Subject: [Bioperl-l] module for description of sequence variants (where to
	place code)
Message-ID: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>

Hi

I'm writing a perl module to write a description of the variance
between 2 sequences as described on
http://www.hgvs.org/mutnomen/recs-prot.html

Basically, given 2 sequences, it would return something like "p.Lys2del
p.His25_Met26insGln" if those are the differences. It also accounts
for the existence of - characters in the sequences that may come from
their alignment.

My question is, where on the project tree should I place the module?

Also, is there something already written that would convert from 1 to
3 letter code?

Carnë


From andreas.leimbach at uni-wuerzburg.de  Mon Feb 25 05:32:43 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Mon, 25 Feb 2013 11:32:43 +0100
Subject: [Bioperl-l] module for description of sequence variants (where
 to place code)
In-Reply-To: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>
References: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>
Message-ID: <512B3DCB.7050008@uni-wuerzburg.de>

Hi Carnë,

for your last question:
You can convert aa strings from one to three letter code with 
'Bio::SeqUtils'.
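
In BioPerl itself the call would be Bio::SeqUtils->seq3($seqobj), which takes a sequence object. Just to illustrate the conversion without needing BioPerl installed, here is a rough sketch in plain Perl; the one_to_three helper and the 'Xaa' fallback are my own assumptions for this example and are not how Bio::SeqUtils is implemented.

```perl
use strict;
use warnings;

# One-letter to three-letter amino acid codes (the 20 standard residues only).
my %THREE = (
    A => 'Ala', R => 'Arg', N => 'Asn', D => 'Asp', C => 'Cys',
    Q => 'Gln', E => 'Glu', G => 'Gly', H => 'His', I => 'Ile',
    L => 'Leu', K => 'Lys', M => 'Met', F => 'Phe', P => 'Pro',
    S => 'Ser', T => 'Thr', W => 'Trp', Y => 'Tyr', V => 'Val',
);

# Hypothetical helper: anything outside the table (gaps, X, ...) becomes 'Xaa'.
sub one_to_three {
    my ($seq) = @_;
    return join '', map { $THREE{ uc $_ } // 'Xaa' } split //, $seq;
}

print one_to_three('MKH'), "\n";
```

For HGVS-style output you would probably want to treat the - gap characters separately rather than mapping them to 'Xaa'.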

Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de



From genehack at genehack.org  Wed Feb 27 19:57:48 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Wed, 27 Feb 2013 16:57:48 -0800
Subject: [Bioperl-l] YAPC talks?
Message-ID: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>

Hi -

Is there anyone that was planning on submitting a Bioperl talk to
YAPC::NA? In an unrelated conversation, one of the organizers
expressed an interest in getting a Bioperl talk this year.

If no one else is planning on a talk submission, Jay Hannah (aka
deafferret) and I are promising/threatening a tag-team style "Bioperl
rules / Bioperl sucks" overview/state of the dist style talk...

thanks,
john.


From cjfields at illinois.edu  Wed Feb 27 21:48:55 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 28 Feb 2013 02:48:55 +0000
Subject: [Bioperl-l] YAPC talks?
In-Reply-To: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
References: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6E705CD3@CHIMBX5.ad.uillinois.edu>

At the moment I personally have no plans on going, but I think a no-holds-barred bioperl talk is a good idea.  

chris



From hlapp at drycafe.net  Wed Feb 27 22:20:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 27 Feb 2013 22:20:34 -0500
Subject: [Bioperl-l] YAPC talks?
In-Reply-To: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
References: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
Message-ID: <42C1F1B8-FE26-43A8-B601-E80D17D215EC@drycafe.net>


On Feb 27, 2013, at 7:57 PM, John SJ Anderson wrote:

> Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl
> rules / Bioperl sucks" overview/state of the dist style talk...

Please videotape. I'll be sure to watch and promote it :-)

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From saladi1 at illinois.edu  Thu Feb 28 01:58:20 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Wed, 27 Feb 2013 22:58:20 -0800
Subject: [Bioperl-l] EUtilities Cookbook - Accn to gi
Message-ID: <CAARX5cXXD_DNb+Sbt-_zXvsn63QAaVBcot9YGtEjQ7ucrqAEKQ@mail.gmail.com>

Hi,

I think that rettype for the section "Get GIs for a list of accessions"
should be

-rettype => 'gi');

instead of 'gilist' as it is now. I think this change is due to a change in
NCBI eutils.

webpage:
http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Get_GIs_for_a_list_of_accessions

Thanks,
Shyam


From fossandonc at hotmail.com  Thu Feb 28 10:36:34 2013
From: fossandonc at hotmail.com (=?iso-8859-1?Q?Francisco_J._Ossand=F3n?=)
Date: Thu, 28 Feb 2013 12:36:34 -0300
Subject: [Bioperl-l] Fix for Bug #3376 broke somewhere else
Message-ID: <SNT133-ds14A180BAFAE068EE359031CFFE0@phx.gbl>

Hi,
I was re-checking Bug #3302 using the Bio::SearchIO modules of the
repository and found that now it can't parse a Hmmer2 file that was
previously fine. After tracking the problem, I discovered that a change in a
regular expression to fix another bug broke the parse.
 
The fix for Bug #3376 consisted of adding an extra condition to skip
lines where the end-of-domain indicator is split across lines
(https://redmine.open-bio.org/issues/3376):
TEST: domain 1 of 1, from 8 to 97: score 184.7, E = 2.5e-56
                   *->svfqqqqssksttgstvtAiAiAigYRYRYRAvtWnsGsLssGvnDn
                      sv+qqqq+  +    +vtAiAiAigYRYRYRAv Wn GsLs G nDn
        Test     8    SVYQQQQGGSA----MVTAIAIAIGYRYRYRAVVWNKGSLSTGTNDN 50   

                   DnDqqsdgLYtiYYsvtvpssslpsqtviHHHaHkasstkiiikiePr<-
                   DnDq +d LYtiYYsvtv +ss+p q+v+HHHaH+asstkiiiki P   
        Test    51 DNDQAAD-LYTIYYSVTVSASSWPGQSVTHHHAHPASSTKIIIKIAPS   97   

                   *

        Test     -   -
This case is characterized by the 2 dashes in the last line.

So the expression added in 'next_result' of hmmer2.pm
(https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2)
was:
                        elsif (CORE::length($_) == 0
                            || ( $count != 1 && /^\s+$/o )
                            || /^\s+\-?\*\s*$/
                            || /^.+\-\s+\-\s*$/ ) ### <--- This regex was designed for bug 3376
                        {
                            next;
                        }

But the expression is too broad because it uses "^.+" just before the 2
dashes, so it also skips alignment rows that are full of dashes and breaks
the parsing of blocks like these:
                   KyACrqCdtiVQAPaPakpIErGiptaGLLArvlVSKyaEHlPLYRQsEI
                                                                     
  lcl|gi|340     - -------------------------------------------------- -    

                   yaRqGVeiaRstLadWVgrtgarLaPLvdALaeyVLkeGklHADeTPVqV
                         +i  s L   V++ + r                           
  lcl|gi|340 60938 ------AIMISGLIHGVSARCLRF-------------------------- 60955

I think a reasonable fix that still fixes the original bug and restores the
function for this case is to add an extra \s+ in the regex just before the
first dash, so the expression makes sure that the first dash is the one that
comes AFTER the description (replacing the usual coordinate number)
and is not the last dash of an alignment or of a series of dashes like the one
above:
                        elsif (CORE::length($_) == 0
                            || ( $count != 1 && /^\s+$/o )
                            || /^\s+\-?\*\s*$/
                            || /^.+\s+\-\s+\-\s*$/ ) ### <--- Tweaked regex
                        {
                            next;
                        }
I tested it and it works fine, hope you find the fix acceptable.
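
The difference between the two patterns can also be checked in isolation with plain Perl. This is only a sketch: the two regexes and the two sample lines are copied from this message, while the verdict helper and the labels are made up for the demonstration and are not part of hmmer2.pm.

```perl
use strict;
use warnings;

# Patterns copied from the message: the one added for bug 3376 and the tweak.
my $old = qr/^.+\-\s+\-\s*$/;
my $new = qr/^.+\s+\-\s+\-\s*$/;

# Sample lines taken from the two HMMER2 reports quoted above.
my $split_marker = '        Test     -   -';
my $gap_row      = '  lcl|gi|340     - -------------------------------------------------- -    ';

# Hypothetical helper: 'skip' if the parser would drop the line, else 'keep'.
sub verdict {
    my ($re, $line) = @_;
    return $line =~ $re ? 'skip' : 'keep';
}

for my $case ( [ 'split marker', $split_marker ], [ 'gap row', $gap_row ] ) {
    my ($label, $line) = @$case;
    printf "%-12s old=%s new=%s\n", $label, verdict($old, $line), verdict($new, $line);
}
```

With these inputs the old pattern skips both lines, while the tweaked one skips only the split end-of-domain marker and keeps the all-gap alignment row.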

Cheers,

--
Francisco J. Ossandon
Bioinformatician.
Ph.D. Candidate, University Andres Bello.
Center for Bioinformatics and Genome Biology,
Fundacion Ciencia para la Vida.
Santiago, Chile.
www.cienciavida.cl/CBGB.htm


From PDagosto at edgebio.com  Mon Feb 25 11:50:34 2013
From: PDagosto at edgebio.com (Phil Dagosto)
Date: Mon, 25 Feb 2013 16:50:34 +0000
Subject: [Bioperl-l] Error when running Build.PL
Message-ID: <DC8C6FE0AED292469CF192A00459937BC0F8660B@EDGE-EXCH02.edgebio.com>

Greetings,

I downloaded BioPerl 1.6.1 from this location: http://www.bioperl.org/wiki/Getting_BioPerl

When I ran Build.PL with all of the default settings chosen in the interactive mode I got the following error message:

Could not get valid metadata. Error is: Invalid metadata structure. Errors: 'Perl_5' for 'license' does not have a URL scheme (resources -> license) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::gff -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::WebAgent -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::EUtilParameters -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::OntologyIO::InterProParser -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Biblio::IO::medlinexml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::strider -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::RandomFactory -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA::ESEfinder -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::game::gameSubs -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::interpro -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::berkeleydb -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::entrezgene -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tinyseq -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::chadoxml -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::SeqIO::game::gameWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::FileCache -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::bsml_sax -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Primer3 -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::HtSNP -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tree::Compatible -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Taxonomy::entrez -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::agave -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::TagHaplotype -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::SeqFeature::Store::FeatureFileLoader -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::Protein* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::blastxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::EUtilities -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::Tree::Draw::Cladogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tigrxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Collection -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Draw::Pictogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::Writer::BSMLResultWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::HIVQuery -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::TreeIO::svggraph -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::eutils -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern::BackTranslate -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::GenBank -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Variation::IO::xml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::GraphViz -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Annotated -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::DB::NCBIHelper -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::HIV -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Run::RemoteBlast -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::excel -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::ClusterIO::dbsnp -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Microarray::Tools::ReseqChip -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::soap -> requires) [Validation: 1.2]
at /usr/local/lib/perl5/5.10.1/Module/Build/Base.pm line 4559

Could not create MYMETA files
Creating new 'Build' script for 'BioPerl' version '1.006001'

I have no idea whether this is a problem or whether I can proceed. Also, I'm confused by the version number referenced in the last line. 1.006001 is our current version - I thought I was installing version 1.6.1. Are these version numbers equivalent, i.e., are the zeros not meaningful?
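
On the version question, one way to check is Perl's core version module, which can convert between the decimal and the dotted forms. A minimal sketch, assuming only the string printed by Build.PL above:

```perl
use strict;
use warnings;
use version;

# Decimal version string as printed by Build.PL in the output above.
my $decimal = '1.006001';

# normal() renders a version in dotted form; the digits after the decimal
# point are read in groups of three (006 -> 6, 001 -> 1).
my $dotted = version->parse($decimal)->normal;

print "$decimal is $dotted\n";
```

So the zeros are just padding: the decimal form 1.006001 and the dotted form 1.6.1 compare as equal.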

I was actually looking for version 1.2.3 (or greater) - where can I find that?

Thanks,
Phil

Phil Dagosto
Sr. Software Engineer
Edge Bio
201 Perry Parkway, Suite 5
Gaithersburg, MD 20850

pdagosto at edgebio.com
(240) 912-8669


From chapmanb at 50mail.com  Thu Feb 28 21:30:01 2013
From: chapmanb at 50mail.com (Brad Chapman)
Date: Thu, 28 Feb 2013 21:30:01 -0500
Subject: [Bioperl-l] Coming soon: BOSC/Broad Hackathon, BOSC Codefest
Message-ID: <874ngvua1i.fsf@fastmail.fm>


Hi all; 
There are some upcoming coding events and conferences of interest to open source
biology programmers:

- BOSC/Broad Interoperability Hackathon -- This is a two day coding session at
  the Broad Institute in Cambridge, MA on April 7-8 focused on improving tool
  interoperability.
  
  Sign up and details: http://j.mp/XJT6ew
  
- Codefest at the Bioinformatics Open Source Conference -- This year BOSC is
  taking place in Berlin from July 19-20 and we'll have a two day coding session
  before the conference. This is the 4th year of Codefests and they've proven to
  be a productive and fun time to work collectively on open source projects.

  Sign up and details: http://www.open-bio.org/wiki/Codefest_2013
  BOSC conference: http://www.open-bio.org/wiki/BOSC_2013

Here are the key dates for the events and abstracts:

April  7-8, 2013: BOSC/Broad Interoperability Hackathon, Cambridge, MA
April   12, 2013: BOSC abstracts due
July 17-18, 2013: Codefest 2013, Berlin
July 19-20, 2013: BOSC 2013, Berlin

Looking forward to seeing everyone this spring and summer for plenty of fun
science and code,
Brad


From jason.stajich at gmail.com  Fri Feb  1 01:58:57 2013
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 31 Jan 2013 22:58:57 -0800
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
References: <mailman.7.1359565204.26693.bioperl-l@lists.open-bio.org>
	<575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
Message-ID: <CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>

Dan - 

I think the answer is yes if others are doing it - I am not in a position to be much of a main coder.

I don't know which format you speak of here or if you had to write something for the text blast changes or something else.  Specific bug reports on formats that aren't working are always helpful.  The XML format has been pretty stable, so I would suggest using that if you are simply parsing reports rather than looking at them.

Chris posted instructions on how to contribute and the move to github simplifies this.  That you had to write a whole new parser seems probably a bit severe - I hope that in the future people can speak to the problems sooner. If I hit a wall with something I can't do I usually write the code to fix it and contribute it back but I don't play follow-the-format-changes with the tools anymore, but hopefully others like yourself can make the contributions.

If you speak to the response I made to the question below, I don't think anyone will try to support the NCBI's additional markups that refer to the upstream and downstream features as they are laid out in the text files without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable.

best wishes,
Jason
On Jan 30, 2013, at 1:40 PM, Dan kilburn <dr_kilburn59 at yahoo.com> wrote:

> Hi Jason,
> 
> Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help.
> 
> --Dan
> Sent from my iPhone
> 
> On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote:
> 
>> Send Bioperl-l mailing list submissions to
>>   bioperl-l at lists.open-bio.org
>> 
>> To subscribe or unsubscribe via the World Wide Web, visit
>>   http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> or, via email, send a message with subject or body 'help' to
>>   bioperl-l-request at lists.open-bio.org
>> 
>> You can reach the person managing the list at
>>   bioperl-l-owner at lists.open-bio.org
>> 
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Bioperl-l digest..."
>> 
>> 
>> Today's Topics:
>> 
>>  1. Re:  Parsing Blast-Report extracting "Features flanking    .."
>>     (Jason Stajich)
>> 
>> 
>> ----------------------------------------------------------------------
>> 
>> Message: 1
>> Date: Tue, 29 Jan 2013 11:00:16 -0800
>> From: Jason Stajich <jason.stajich at gmail.com>
>> Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features
>>   flanking    .."
>> To: buschj at hhu.de
>> Cc: bioperl-l at lists.open-bio.org
>> Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com>
>> Content-Type: text/plain;    charset=us-ascii
>> 
>> We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that.
>> 
>> However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features.
>> 
>> basically:
>> - download the genome and GFF for arabidopsis
>> - align your sRNA to the genome with a short read aligner - bowtie, bwa, others
>> - convert your sam to bam file with SAMtools or picard
>> - compare the location of features with the reads to get expression summaries or individuals reads with BEDTools
>> 
>> 
>> On Jan 25, 2013, at 2:20 AM, jobu <buschj at hhu.de> wrote:
>> 
>>> Am 22.01.2013 19:03, schrieb Mgavi Brathwaite:
>>>> What upstream and downstream elements are you interested in?
>>> 
>>> 
>>> I've got a huge pile of short RNA reads.
>>> Part of the question now is whether those RNA fragments originate from
>>> siRNA events,
>>> or may represent miRNAs / parts of pre-miRNAs.
>>> 
>>> So I did an online  blast search against database nt.
>>> The resulting report quite often just gives subject information like this:
>>> 
>>> -----
>>>> gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence
>>> Length=23459830
>>> -----
>>> 
>>> Now I would like to get the hit's neighbouring regions  for further
>>> analysis.
>>> Preferably I would like to do that  in an automized way, but the only
>>> possible action with this kind of subject gi | description would be to
>>> fetch the entire chromosomal  sequence I guess ?
>>> 
>>> However,
>>> right below the line above, the report states more precisely:
>>> 
>>> ------
>>> Features flanking this part of subject sequence:
>>> 8872 bp at 5' side: cytochrome P450 90B1
>>> 402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K
>>> ------
>>> 
>>> Still I would like to have the possibility to automatically fetch the
>>> subject's sequence(s),
>>> as of now I think  parsing the report with SearchIO won't let me aquire
>>> that information, because SearchIO does not recognize report sections
>>> like those.
>>> 
>>> I hope I did not miss any of SearchIOs capabilities, but I could not
>>> find any method covering my wish?!
>>> 
>>> Right now maybe the only way to get the information I want is to
>>> construct my own parser and write it out into a separate file, which in
>>> turn again  I could read into a hash before processing the Blast-Report
>>> with SearchIO to combine both data for further automized work.
>>> 
>>> I am aware though that even successfully getting the flanking features
>>> would leave me with the more or less wide  intergenic gap my hsp is
>>> located in.
>>> 
>>> However I'm in need of a way to get the flanking features including
>>> their annotation and the region spanning between them.
>>> But I hope I do not have to get complete sequences to accomplish that,
>>> as this would be kind of an overkill.
>>> 
>>> with kind regards
>>> Jochen
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>> 
>> 
>> 
>> 
>> ------------------------------
>> 
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>> End of Bioperl-l Digest, Vol 117, Issue 13
>> ******************************************
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From dr_kilburn59 at yahoo.com  Fri Feb  1 09:25:34 2013
From: dr_kilburn59 at yahoo.com (Dan Kilburn)
Date: Fri, 1 Feb 2013 06:25:34 -0800 (PST)
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>
References: <mailman.7.1359565204.26693.bioperl-l@lists.open-bio.org>
	<575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
	<CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>
Message-ID: <1359728734.27412.YahooMailNeo@web162006.mail.bf1.yahoo.com>

Hi Jason,

Thanks for the detailed feedback.  The real reason I had to write my own parser is that, even with close, repeated support from NCBI, we couldn't get XML output with short_web_blast.pl because the parameter that turns on XML output was not functioning (they've probably fixed it by now), and I had to crank out a parser asap to support a job talk.

I don't think the upstream and downstream feature reports are particularly useful, because in mammals they tend to be so far away that they are not likely to be biologically relevant.  But the internal motif reports are useful, maybe especially if you are blasting short reads, like I was.  A 16-mer preserved domain hit is really good if you're blasting 18-mer Illumina short reads.

As far as my involvement goes, I got diagnosed with cancer on Wednesday, so I'll be taking a step back until next week's surgery and taking a lot of deep breaths.  On the other hand, this just makes me more motivated: I've been thinking a lot about time, and timely contributions, the last two days.

Cheers,
Dan
 



From carandraug+dev at gmail.com  Sat Feb  2 20:44:31 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Sun, 3 Feb 2013 01:44:31 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue with
	output option
Message-ID: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>

Hi

the TCoffee module does not accept options of the named argument type:

-arg => option

one needs to do like

'arg' => option

Is there a special reason for this? I tracked down this to the commit

7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e

12 years ago[1]. A comment on the code actually says "don't want named
parameters"[2] (though the commit message sounds pretty innocuous
"migrated to new Bio::Root::RootI chained new"). Is there a reason for
this? The rest of bioperl has no issue with named parameters, and the
API should be the same as Clustalw which also has no problem with it.
This is very easy to fix, I can submit a pull request no problem.

Also, shouldn't the code complain in the case of non-supported
options? Took me a very long time to find out the problem because
there was no complaints coming from the code.
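
To make the two points above concrete, here is a minimal plain-Perl sketch of argument handling that accepts both '-arg' and 'arg' and dies on unknown keys. It is not the TCoffee wrapper's code: the function normalize_params and the option names in %SUPPORTED are hypothetical and only serve the illustration.

```perl
use strict;
use warnings;

# Hypothetical option list for illustration; the real module has its own.
my %SUPPORTED = map { $_ => 1 } qw(output matrix ktuple quiet);

# Accept both '-output' and 'output', and die on anything unknown
# instead of silently ignoring it.
sub normalize_params {
    my (@args) = @_;
    die "Odd number of arguments; expected key => value pairs\n" if @args % 2;
    my %params;
    while (my ($key, $value) = splice @args, 0, 2) {
        (my $name = lc $key) =~ s/^-//;    # strip one leading dash
        die "Unsupported option: $key\n" unless $SUPPORTED{$name};
        $params{$name} = $value;
    }
    return \%params;
}

my $with_dash    = normalize_params(-output => 'fasta', -quiet => 1);
my $without_dash = normalize_params('output' => 'fasta', 'quiet' => 1);
print join(',', map { "$_=$with_dash->{$_}" } sort keys %$with_dash), "\n";

# A misspelled key is reported instead of being dropped.
my $ok = eval { normalize_params(-outptu => 'fasta'); 1 };
print "error: $@" unless $ok;
```

Because both spellings map to the same key, old scripts that pass 'arg' would keep working while new ones could use -arg.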

There is also a problem with the way it handles the output option.
I'll have to look closer into it, but the documentation is simply
incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
(undocumented), works fine.

Carnë
[1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
[2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374


From cjfields at illinois.edu  Sun Feb  3 16:54:51 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 3 Feb 2013 21:54:51 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue
 with	output option
In-Reply-To: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
References: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>

Carnë,

On Feb 2, 2013, at 7:44 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> the TCoffee module does not options of the named argument type:
> 
> -arg => option
> 
> one needs to do like
> 
> 'arg' => option
> 
> Is there a special reason for this? I tracked down this to the commit
> 
> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
> 
> 12 years ago[1]. A comment on the code actually says "don't want named
> parameters"[2] (though the commit message sounds pretty innocuous
> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
> this? The rest of bioperl has no issue with named parameters, and the
> API should be the same as Clustalw which also has no problem with it.
> This is very easy to fix, I can submit a pull request no problem.

IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones.  This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.  

The downside of big changes like this: potential backwards compatibility issues.  Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change.  I don't have a problem breaking this with a bioperl 2.0 release, though.  

> Also, shouldn't the code complain in the case of non-supported
> options? Took me a very long time to find out the problem because
> there was no complaints coming from the code.

Yes, it should complain when options are given that do not make sense, some validation would help there.  With some modules this might be a side-effect of using AUTOLOAD or simply not checking the parameters.

> There is also a problem with the way it handles the output option.
> I'll have to look closer into it, but the documentation is simply
> incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
> (undocumented), works fine.

That's entirely possible.

> Carnë
> [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
> [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374

As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it.  Infernal was this way IIRC.  Maybe these should just be simply stored as a semi-validated set of key-value pairs.  
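
A rough sketch of the key-value idea, in plain Perl and under stated assumptions: the option names ('max-gap', '3prime', 'threads') and the helpers set_param and to_argv are hypothetical, not taken from Infernal or any existing wrapper. The point is only that names which cannot be method names are unproblematic as hash keys, and each can still carry a validator.

```perl
use strict;
use warnings;

# Hypothetical spec for a wrapped program: option name => validator.
# 'max-gap' and '3prime' could not be Perl subroutine names, but they
# work fine as hash keys.
my %SPEC = (
    'max-gap' => sub { defined $_[0] && $_[0] =~ /\A\d+\z/ },        # integer >= 0
    '3prime'  => sub { defined $_[0] && $_[0] =~ /\A\d+\z/ },        # integer >= 0
    'threads' => sub { defined $_[0] && $_[0] =~ /\A[1-9]\d*\z/ },   # integer >= 1
);

# Store a value after checking the name and the value.
sub set_param {
    my ($store, $name, $value) = @_;
    my $check = $SPEC{$name} or die "Unknown parameter: $name\n";
    $check->($value) or die "Invalid value for parameter: $name\n";
    $store->{$name} = $value;
    return $store;
}

# Turn the stored pairs into a command-line argument list.
sub to_argv {
    my ($store) = @_;
    return map { ("--$_", $store->{$_}) } sort keys %$store;
}

my %store;
set_param(\%store, 'max-gap', 5);
set_param(\%store, '3prime',  20);
print join(' ', to_argv(\%store)), "\n";
```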

chris


From carandraug+dev at gmail.com  Sun Feb  3 23:34:22 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Mon, 4 Feb 2013 04:34:22 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue
 with output option
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_2b2+Dy-HW3ngjNd2tjaTxgvFpTR-rKzq7HOO-6ZzyoTQ@mail.gmail.com>

On 3 February 2013 21:54, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 2, 2013, at 7:44 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> Hi
>>
>> the TCoffee module does not options of the named argument type:
>>
>> -arg => option
>>
>> one needs to do like
>>
>> 'arg' => option
>>
>> Is there a special reason for this? I tracked down this to the commit
>>
>> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
>>
>> 12 years ago[1]. A comment on the code actually says "don't want named
>> parameters"[2] (though the commit message sounds pretty innocuous
>> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
>> this? The rest of bioperl has no issue with named parameters, and the
>> API should be the same as Clustalw which also has no problem with it.
>> This is very easy to fix, I can submit a pull request no problem.
>
> IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones.  This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.
>
> The downside of big changes like this: potential backwards compatibility issues.  Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change.  I don't have a problem breaking this with a bioperl 2.0 release, though.

Should passing the tests be enough? There's one for TCoffee. At the
moment I don't see how this would cause compatibility issues, since we
are adding an option, not removing one. But the comment in the code,
stating plainly that the -param API was not wanted, caught me by
surprise, which is why I'm asking.

> As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it.  Infernal was this way IIRC.  Maybe these should just be simply stored as a semi-validated set of key-value pairs.

Judging from a quick glance at the list of TCoffee parameters, I don't
at the moment see any that should cause problems.

I have submitted a bug report[1] which mentions some other issues I
found with TCoffee. If someone could comment on them, that would be
great and I can start fixing them.

Carnë

[1] https://redmine.open-bio.org/issues/3406


From whereverroadgoes at gmail.com  Mon Feb  4 10:39:19 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 07:39:19 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
Message-ID: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>

The result I get is:

Number of bases of type A = 
Number of bases of type C = 
Number of bases of type G = 
Number of bases of type T = 

i.e. the expected values are missing.
Please help!

#! /usr/bin/perl

use Bio::Tools::SeqStats;
use Bio::Seq;

open (FILE, "seq.fasta");
@array = <FILE>;

# Removing first line of fasta

shift (@array);
$array = join('', @array);
open (FILE2, ">>seq2.fasta");
print FILE2 "$array";

$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
- alphabet => 'dna',);


my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);

my $monomer_ref = $seq_stats->count_monomers();

foreach $base (sort keys %$monomer_ref) {
print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";
}


From hamish.mcwilliam at bioinfo-user.org.uk  Mon Feb  4 11:59:16 2013
From: hamish.mcwilliam at bioinfo-user.org.uk (Hamish McWilliam)
Date: Mon, 4 Feb 2013 16:59:16 +0000
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
In-Reply-To: <loom.20130201T045704-740@post.gmane.org>
References: <200305311150.h4VBopn2019091@localhost.localdomain>
	<loom.20130201T045704-740@post.gmane.org>
Message-ID: <CABqDwwLHWp2fZm5h8KJmZhBFV6QmNLJrg5OE=hR+9U3Y3UJ7_g@mail.gmail.com>

BLASTCLUST is part of the legacy NCBI BLAST package (not NCBI BLAST+)
and can be obtained from:

ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST

As Robert notes there are many other tools which can be used to
perform sequence clustering, Wikipedia has a Sequence Clustering
article (http://en.wikipedia.org/wiki/Sequence_clustering) which lists
some of the most commonly used.

All the best,

Hamish

On 1 February 2013 04:15, Rob <yuf228 at hotmail.com> wrote:
> Cyril C.C. Chua <bmbcccc <at> bmb.leeds.ac.uk> writes:
>
>>
>> Hi,
>>
>> I have some difficulty in sourcing for BLASTCLUST or related
>> programs/mods. Does any1 know exactly how to locate them?
>>
>> Regards
>>
>> Cyril Chua
>>
>
>
> Hi Cyril,
>
> I heard of the following programmes that might do similar things (I HAVEN'T
> used any of them yet):
>
> Afree - http://www.vicbioinformatics.com/software.afree.shtml
> Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf
> Usearch - http://www.drive5.com/usearch/
> DomClust - http://mbgd.genome.ad.jp/domclust/
>
> or
>
> Check this:
>
> http://ppod.princeton.edu/help/help_tech.html
>
> God bless,
>
>
> Robert
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


--
----
"Saying the internet has changed dramatically over the last five years
is cliché - the internet is always changing dramatically" - Craig
Labovitz, Arbor Networks.


From whereverroadgoes at gmail.com  Mon Feb  4 12:34:10 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 09:34:10 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
Message-ID: <b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>

Thanks Roy,

It still doesn't seem to produce anything. :/


From roy.chaudhuri at gmail.com  Mon Feb  4 12:51:03 2013
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Mon, 4 Feb 2013 17:51:03 +0000
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
Message-ID: <CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>

Sorry, I'd missed another problem in your code - you are trying to
load a fasta file using Bio::PrimarySeq. To read sequence data from a
file you should use Bio::SeqIO, see:

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_file
http://www.bioperl.org/wiki/HOWTO:SeqIO

Cheers,
Roy.


From asjo at koldfront.dk  Mon Feb  4 12:58:25 2013
From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Mon, 04 Feb 2013 18:58:25 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> (Slym's
	message of "Mon, 4 Feb 2013 07:39:19 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID: <8738xc2c72.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 07:39:19 -0800 (PST), Slym wrote:

> #! /usr/bin/perl

> use Bio::Tools::SeqStats;
> use Bio::Seq;

It can be a good idea to add "use strict; use warnings;" to the top of
your script. At least two problems in your program would have been
caught by perl if you had.

> open (FILE, "seq.fasta");

Using (global) literal filehandles and the two parameter open() is
somewhat outdated, a more current way to do it could be:

  open my $fh, '<', 'seq.fasta';
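
A slightly fuller sketch of that idiom, with error checking added, is below. It reads the first FASTA record by hand and returns the header and the sequence string; the function name read_first_fasta_record is made up for this example, and the temporary file only exists so the snippet runs on its own. It is plain Perl and not a substitute for Bio::SeqIO.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read the first record of a FASTA file and return (header, sequence).
# Illustrative helper only; Bio::SeqIO handles this far more robustly.
sub read_first_fasta_record {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my ($header, $seq) = (undef, '');
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>(.*)/) {
            last if defined $header;    # a second record starts: stop
            $header = $1;
        }
        elsif (defined $header) {
            $line =~ s/\s+//g;          # drop stray whitespace and CRs
            $seq .= $line;
        }
    }
    close $fh;
    return ($header, $seq);
}

# Write a small test file so the example is self-contained.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} ">test\naaaaccc\nggt\n";
close $tmp_fh;

my ($id, $sequence) = read_first_fasta_record($tmp_path);
print "$id: $sequence\n";
```

The returned string is the kind of value that could be handed to a sequence object through a '-seq' parameter, as discussed further down in this message.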

> @array = <FILE>;

> # Removing first line of fasta

> shift (@array);
> $array = join('', @array);
> open (FILE2, ">>seq2.fasta");
> print FILE2 "$array";

Note that you are writing just the sequence to your seq2.fasta file
here, so the new file isn't really a fasta file.

> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
> - alphabet => 'dna',);

Bio::PrimarySeq doesn't take a '-file' parameter. Also, note that the
filename is different than before "sekw2" vs. "seq2"!

Either you should use Bio::SeqIO with a '-file' parameter, or you can
use Bio::PrimarySeq with a '-seq' parameter.

> my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);

> my $monomer_ref = $seq_stats->count_monomers();

> foreach $base (sort keys %$monomer_ref) {
> print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";

Here you wanted $monomer_ref->{$base}, as %monomer_ref isn't mentioned
anywhere else.
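
A small stand-alone illustration of the difference, using made-up counts rather than real SeqStats output: the arrow form and the double-sigil form both go through the reference, while the form without the arrow addresses a separate hash. Without "use strict" that hash is silently empty, which would explain the blank values in the output; with it, perl refuses to compile.

```perl
use strict;
use warnings;

# Example data only; count_monomers() returns a hash reference shaped
# like this one.
my $monomer_ref = { A => 4, C => 3, G => 2, T => 1 };

# Both forms dereference the hash reference held in $monomer_ref.
print "arrow: $monomer_ref->{A}\n";
print "sigil: $$monomer_ref{A}\n";

# $ref{A} is an element of a different variable, the hash %ref. Under
# "use strict" that is a compile-time error when no such hash has been
# declared; the string eval lets us capture the message.
my $compiled = eval q{
    use strict;
    my $ref = { A => 4 };
    my $value = $ref{A};
    1;
};
print "strict says: $@" unless $compiled;
```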

> }

Here is a complete version of your script - I chose to use Bio::SeqIO -
that works:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use Bio::SeqIO;
  use Bio::Tools::SeqStats;

  my $io=Bio::SeqIO->new(-file=>'seq.fasta', -alphabet=>'dna');
  my $seqobj=$io->next_seq; # Get the first sequence from the file

  my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
  my $monomer_ref = $seq_stats->count_monomers();
  foreach my $base (sort keys %$monomer_ref) {
      print "Liczba zasad typu ", $base," = ", $monomer_ref->{$base},"\n";
  }

E.g.:

  $ cat seq.fasta
  >test
  aaaacccggt
  $ ./slym.pl 
  Liczba zasad typu A = 4
  Liczba zasad typu C = 3
  Liczba zasad typu G = 2
  Liczba zasad typu T = 1
  $ 


  Best regards,

    Adam

-- 
 "Grittings. Ma nam is Kahlfin."                              Adam Sjøgren
                                                         asjo at koldfront.dk


From whereverroadgoes at gmail.com  Mon Feb  4 13:02:29 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 10:02:29 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
Message-ID: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>

The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
error (saying that it wants input provided by Bio::PrimarySeq).
(btw in this line
 $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 
'dna',); 
there's a typo "sekw2" instead of "seq2" but this is correct in my original 
code).


From cjfields at illinois.edu  Mon Feb  4 13:54:39 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Mon, 4 Feb 2013 18:54:39 +0000
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE161ED@CHIMBX5.ad.uillinois.edu>

Please make sure to read both Roy's and Adam's responses all the way through; Bio::SeqIO is not a sequence object but the front end for format parsing (e.g. FASTA).  Bio::PrimarySeq does not have a '-file' parameter; Bio::SeqIO does.

If SeqStats truly doesn't work with Bio::Seq we can fix that, but Adam has tested it with a sequence read through Bio::SeqIO and it seems to work.

chris

On Feb 4, 2013, at 12:02 PM, Slym <whereverroadgoes at gmail.com>
 wrote:

> The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
> error (saying that it wants input provided by Bio::PrimarySeq).
> (btw in this line
> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 
> 'dna',); 
> there's a typo "sekw2" instead of "seq2" but this is correct in my original 
> code).
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From asjo at koldfront.dk  Mon Feb  4 15:00:32 2013
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Mon, 04 Feb 2013 21:00:32 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com> (Slym's
	message of "Mon, 4 Feb 2013 10:02:29 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
Message-ID: <87txpr26jj.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 10:02:29 -0800 (PST), Slym wrote:

> The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
> error (saying that it wants input provided by Bio::PrimarySeq).

That sounds like you forgot to call ->next_seq() on the Bio::SeqIO
object - to get a sequence object - please see the complete, working
example I sent earlier.
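
For the archive, a minimal sketch of that pattern (this is not the example
from the earlier mail, which is not reproduced here). The FASTA record and
the temporary file are made up for illustration; Bio::SeqIO does the
parsing, next_seq() hands back the sequence object, and only that object is
passed to Bio::Tools::SeqStats:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Bio::SeqIO;
use Bio::Tools::SeqStats;

# Made-up FASTA record, written to a temp file so the sketch is self-contained.
my ($fh, $fasta_file) = tempfile(SUFFIX => '.fasta', UNLINK => 1);
print {$fh} ">demo\nAACGT\n";
close $fh;

# Bio::SeqIO is the parser, not a sequence object ...
my $in = Bio::SeqIO->new(-file => $fasta_file, -format => 'fasta');

# ... so ask it for the next sequence before building the stats object.
my $seqobj = $in->next_seq;
my $stats  = Bio::Tools::SeqStats->new(-seq => $seqobj);

# count_monomers() returns a hash reference of base => count.
my $counts = $stats->count_monomers;
print "$_: $counts->{$_}\n" for sort keys %$counts;
```

Passing $in itself (instead of $seqobj) to SeqStats is what would trigger
the complaint about wanting a Bio::PrimarySeq-style object.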


  Best regards,

    Adam

-- 
 "Denial springs eternal."                                    Adam Sjøgren
                                                         asjo at koldfront.dk


From scott at scottcain.net  Tue Feb  5 09:45:14 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 5 Feb 2013 09:45:14 -0500
Subject: [Bioperl-l] Have your say in the 2013 GMOD Community Survey!
Message-ID: <CA+JTaoy5NZubXo2jQ8oDN20BQ5BAHg3B9ZmYZRJ6f2Ryr+-awQ@mail.gmail.com>

Give us your thoughts on the GMOD project and win a personal DNA test
from 23andMe!

The GMOD project provides tools like GBrowse, Galaxy, MAKER, JBrowse,
Tripal, Apollo, Chado, and many more to a huge community of users and
developers around the world.

To make sure that GMOD is giving you the support you need, we want to
know how you use GMOD, which components you find valuable, your
opinion on support, training, and GMOD's strengths and weaknesses.
Your feedback is vital in helping GMOD to serve its user community
more effectively and to suggest future directions for the project.

Do the survey: http://gmod.org/survey.html

The survey should take between 10 and 15 minutes (including thinking
time), and participants can enter a draw to win "A Journey Through
Your DNA", the personal DNA test from 23andMe (the winner can pick a
$50 Amazon gift voucher if they prefer).

The survey will be open until March 1st. Results will be collated and
discussed at the April 2013 GMOD Meeting in Cambridge, UK, and posted
on the GMOD wiki at http://gmod.org.

Please spread the word to other friends and colleagues who use GMOD:
the more voices we hear, the better the picture we get of the needs of
our users, and the better we can help you!

Do the survey: http://gmod.org/survey.html

If you have any questions or problems with the survey, please email me
-- I will be happy to help out!

Thanks,
Scott


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From tiago.hori at gmail.com  Tue Feb  5 10:21:55 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:21:55 -0800 (PST)
Subject: [Bioperl-l] Search I::O
Message-ID: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>

Hi All,

I am trying to find the best putative orthologs for 44K Atlantic Salmon 
sequences, and so I need to parse 44K BLAST reports to find the best human 
hit. I am trying to learn Bio::SearchIO, but when I try the first example in 
the HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast'
               -file => 'C001R047.txt');

while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }  
  }
}

I get this error: Odd number of elements in hash assignment at 
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

I am using BioPerl version 1.6.901. Is there a format problem with the 
blast reports?

Any help would be greatly appreciated!

T.
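
A likely culprit, assuming the snippet above is pasted exactly as it was
run: there is no comma between 'blast' and -file in the constructor call.
Perl then reads 'blast' -file as a subtraction, so the constructor receives
three arguments instead of four, which is what "Odd number of elements in
hash assignment" is complaining about. The plain-Perl sketch below (no
BioPerl needed, file name is a placeholder) shows the effect:

```perl
use strict;
use warnings;

# Without the comma, Perl parses 'blast' - "file" as a subtraction,
# so the argument list collapses from four elements to three.
my @broken = do {
    no warnings 'numeric';    # silence "isn't numeric" for this demo
    ( -format => 'blast' -file => 'test.txt' );
};

# With the comma, the list is two key/value pairs as intended.
my @fixed = ( -format => 'blast', -file => 'test.txt' );

printf "broken: %d elements, fixed: %d elements\n",
    scalar @broken, scalar @fixed;

# The corrected constructor call would then read:
#   my $in = Bio::SearchIO->new(-format => 'blast',
#                               -file   => 'C001R047.txt');
```

If that is the cause, the BLAST reports themselves are not at fault.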


From tiago.hori at gmail.com  Tue Feb  5 10:33:32 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:33:32 -0800 (PST)
Subject: [Bioperl-l] Search::IO example from HOWTO
Message-ID: <c87907a1-18da-49ed-ad70-55ca7bd27658@googlegroups.com>

Hi All,

I am trying to run the example from the Bio::SearchIO HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast'
               -file => 'test.txt');

while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }  
  }
}

And I get this error: Odd number of elements in hash assignment at 
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

Can anybody help?

Cheers,

T.


From carandraug+dev at gmail.com  Tue Feb  5 13:56:21 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 18:56:21 +0000
Subject: [Bioperl-l] removing packages from bioperl-live
Message-ID: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>

Hi

some of the bioperl-live packages have already been split into
separate repositories. However, they were never actually removed from
bioperl-live. This creates 2 entry points for bug fixes and
implementations. After a chat on #bioperl, I was told to ask here.

Should these be removed? For example, there's bioperl-FeatureIO but
that code also exists in bioperl-live. Can I remove it from
bioperl-live?

Carnë


From cjfields at illinois.edu  Tue Feb  5 14:34:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 19:34:07 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from
 bioperl-live
In-Reply-To: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>

Probably should retitle this to ask the question directly (make sure the right radars are pinged).

My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).

chris

On Feb 5, 2013, at 12:56 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> some of the bioperl-live packages have already been split into
> separate repositories. However, they were never actually removed from
> bioperl-live. This creates 2 entry points for bug fixes and
> implementations. After a chat on #bioperl, I was told to ask here.
> 
> Should these be removed? For example, there's bioperl-FeatureIO but
> that code also exists in bioperl-live. Can I remove it from
> bioperl-live?
> 
> Carnë


From scott at scottcain.net  Tue Feb  5 14:36:10 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 5 Feb 2013 14:36:10 -0500
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID: <CA+JTaowxkgy+2ytqHG-MG6VrOdT7jGLQ9-_TJfVA3COsLgUZYw@mail.gmail.com>

I'm sure it will lead to lots of fun, but I suspect you are right and
it should be removed.  It's time you yank on that bandaid :-)

Scott


On Tue, Feb 5, 2013 at 2:34 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>
> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>
> chris
>
> On Feb 5, 2013, at 12:56 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> Hi
>>
>> some of the bioperl-live packages have already been split into
>> separate repositories. However, they were never actually removed from
>> bioperl-live. This creates 2 entry points for bug fixes and
>> implementations. After a chat on #bioperl, I was told to ask here.
>>
>> Should these be removed? For example, there's bioperl-FeatureIO but
>> that code also exists in bioperl-live. Can I remove it from
>> bioperl-live?
>>
>> Carnë


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From carandraug+dev at gmail.com  Tue Feb  5 15:06:23 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 20:06:23 +0000
Subject: [Bioperl-l] dependencies on perl version
Message-ID: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>

Hi

How much Perl backwards compatibility does BioPerl need to keep?

If I have something I want to implement that uses state (requires
5.010), is it acceptable? 5.010 is already quite an old perl version.
Of course, there are other less elegant ways to implement those
features. If I can't use modern perl stuff, what version number is the
limit?

Carnë


From carandraug+dev at gmail.com  Tue Feb  5 15:10:01 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 20:10:01 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>

On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>
> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).

Mentioning Bio::FeatureIO was just an example. I meant to ask it as
more general. If the code is already in a separate repository, should
it be removed from bioperl-live?

Carnë


From cjfields at illinois.edu  Tue Feb  5 15:56:48 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:56:48 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>

Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.  
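
As a sketch of what that looks like at the top of a file (the counter sub
is a made-up example, not BioPerl code): the version line makes older perls
refuse to compile the file with a clear message, and it also switches on
'state' and 'say'.

```perl
use 5.010;          # enables 'say' and 'state'; older perls stop here
use strict;
use warnings;

# Hypothetical helper: a counter that keeps its value between calls.
# 'state' initialises $n once, on the first call only.
sub next_id {
    state $n = 0;
    return ++$n;
}

say next_id() for 1 .. 3;
```

Before 5.10 the same effect needs a closure over a lexical declared outside
the sub, which is the "less elegant" route mentioned in the question.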

(for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)

chris

On Feb 5, 2013, at 2:06 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> how much perl backwards compatibility does bioperl needs to keep?
> 
> If I have something I want to implement and use state (requires
> 5.010), is it acceptable? 5.010 is already a quite old perl version.
> Of course, there are other less elegant ways to implement those
> features. If I can't use modern perl stuff, what version number is the
> limit?
> 
> Carnë


From cjfields at illinois.edu  Tue Feb  5 15:59:38 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:59:38 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 2:10 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>> 
>> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
> 
> Mentioning Bio::FeatureIO was just an example. I meant to ask it as
> more general. If the code is already in a separate repository, should
> it be removed from bioperl-live?
> 
> Carnë

Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better).  Once we get a new release out we should remove the rest.

chris


From cjfields at illinois.edu  Tue Feb  5 16:53:29 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 21:53:29 +0000
Subject: [Bioperl-l] Next BioPerl release
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>

All,

I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  

Among other changes, and as mentioned in a separate thread, we will remove Bio::FeatureIO, which is now developed in a separate repository:

    https://github.com/bioperl/Bio-FeatureIO

Feedback, suggestions, etc are greatly appreciated.

chris


From miker at htblis.com  Tue Feb  5 19:54:17 2013
From: miker at htblis.com (Michael Rogoff)
Date: Tue, 5 Feb 2013 16:54:17 -0800
Subject: [Bioperl-l] Bio::Graphics error when rendering features with Split
	locations
Message-ID: <C71FF11A-F2E2-4204-9A10-50F5535A0C81@htblis.com>

When trying to render features from a GenBank file that include a split location, e.g.:

     promoter        join(1000..1080,1..5)
                     /label=PROM1

The following exception is raised:
Can't locate object method "has_tag" via package "Bio::Location::Simple" at lib/perl5/site_perl/5.10.1/Bio/Graphics/Glyph.pm line 704, <GEN0> line 36.

This can be reproduced with the code in the example "Rendering Features from a GenBank or EMBL File" from the Graphics HOW-TO:
http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File

Is there a way to change the script so that split locations would, at the very least, not cause a fatal error?  Is there a different glyph type that needs to be used?  Thanks in advance for any help.

I've attached a simple GenBank input that will reproduce the error:

LOCUS       sample2     1080 bp DNA    circular
DEFINITION  Cloning vector sample2
ACCESSION   sample2
VERSION     sample2.1  GI:4352432
COMMENT     Component Fragments
FEATURES               Location/Qualifiers
     terminator      39..328
                     /label=TERM1
                     /note="terminator 1"
     misc_feature    393..488
                     /label=MF1
     CDS             complement(800..900)
                     /label=CDS1
                     /note="resistence gene"
     promoter        join(1000..1080,1..5)
                     /label=PROM1
ORIGIN
        1  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
       61  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      121  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      181  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      241  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      301  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      361  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      421  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      481  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      541  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      601  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      661  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      721  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      781  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      841  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      901  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      961  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
     1021  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
//


P.S.  I think I have traced the source of the problem to Glyph's _subfeat method, which in the case of a feature with split locations is returning location objects instead of feature objects.  Is this a bug?

sub _subfeat {
  my $class   = shift;
  my $feature = shift;

  return $feature->segments     if $feature->can('segments');

  my @split = eval { my $id   = $feature->location->seq_id;
                     my @subs = $feature->location->sub_Location;
                     grep {$id eq $_->seq_id} @subs;
                   };

  return @split if @split;

  # Either the APIs have changed, or I got confused at some point...
  return $feature->get_SeqFeatures         if $feature->can('get_SeqFeatures');
  return $feature->sub_SeqFeature          if $feature->can('sub_SeqFeature');
  return;
}


From l.m.timmermans at students.uu.nl  Tue Feb  5 21:40:27 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 03:40:27 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>

On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>
> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)

I *really* hate saying it, but I fear a lot of places are still stuck
on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
department still is and doesn't seem to be in a hurry to upgrade, and
I'm pretty sure it won't be the only one (though personally I use a
self-compiled 5.16).

Leon


From florent.angly at gmail.com  Tue Feb  5 21:51:27 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:51:27 +1000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
Message-ID: <5111C52F.50101@gmail.com>

On 06/02/13 06:59, Fields, Christopher J wrote:
> On Feb 5, 2013, at 2:10 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
>>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>>>
>>> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>> Mentioning Bio::FeatureIO was just an example. I meant to ask it as
>> more general. If the code is already in a separate repository, should
>> it be removed from bioperl-live?
>>
>> Carnë
> Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better).  Once we get a new release out we should remove the rest.
>
> chris

Sounds good to me (I've been burnt once by the fact that Bio::FeatureIO 
is in two places).
Florent


From florent.angly at gmail.com  Tue Feb  5 21:56:19 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:56:19 +1000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
Message-ID: <5111C653.2010703@gmail.com>

For what it's worth, the current stable version of Debian uses perl 
5.10.1 (http://packages.debian.org/stable/perl/perl).
Florent

On 06/02/13 12:40, Leon Timmermans wrote:
> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>
>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
> I *really* hate saying it, but I fear a lot of places are still stuck
> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
> department still is and doesn't seem to be in a hurry to upgrade, and
> I'm pretty sure it won't be the only one (though personally I use a
> self-compiled 5.16).
>
> Leon


From hlapp at drycafe.net  Tue Feb  5 22:27:35 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 5 Feb 2013 22:27:35 -0500
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <09524241-59F8-4BFF-8054-53CD0A649C11@drycafe.net>


On Feb 5, 2013, at 4:53 PM, Fields, Christopher J wrote:

> I am scheduling the next BioPerl CPAN release tentatively for March 1.

Yay!! Thanks for your leadership again, Chris, and for volunteering your time for the project. If nothing else, and I know this is no compensation really worth speaking of, we owe you beer, and I'll certainly pay my debt to you in Berlin if you come there.

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From hlapp at drycafe.net  Tue Feb  5 22:32:40 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 5 Feb 2013 22:32:40 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <5111C653.2010703@gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
Message-ID: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>

Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS.

8 years is really old, and at some point I fear that weighing backwards compatibility too heavily just holds us back in a really detrimental way.

	-hilmar

On Feb 5, 2013, at 9:56 PM, Florent Angly wrote:

> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl).
> Florent
> 
> On 06/02/13 12:40, Leon Timmermans wrote:
>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
>> <cjfields at illinois.edu> wrote:
>>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>> 
>>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
>> I *really* hate saying it, but I fear a lot of places are still stuck
>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
>> department still is and doesn't seem to be in a hurry to upgrade, and
>> I'm pretty sure it won't be the only one (though personally I use a
>> self-compiled 5.16).
>> 
>> Leon

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From cjfields at illinois.edu  Tue Feb  5 22:58:08 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 03:58:08 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18CBE@CHIMBX5.ad.uillinois.edu>

Re: being held back, I agree.  I don't necessarily want to intentionally break current modules by adding modern code unless it can be demonstrated to be a decent benefit performance-wise, but I don't want to impede new additions by requiring compat with perl 5.8 (hence my suggestion of a 'use 5.01x' pragma when appropriate).

Ubuntu 12.04 LTS is on perl 5.14.2: 

    http://askubuntu.com/questions/80672/what-perl-version-will-be-in-12-04-lts

BTW, I was wrong about perl 5.8 being 8 yrs old; it's almost 11 yrs old (perl 5.8.0 was released on 7/18/2002).  perl 5.8 reached end-of-life in 2008, fixes being only for security reasons.

So, I support dropping perl 5.8 support, but we should have a decent route of use for the folks stuck on old clusters.

chris

On Feb 5, 2013, at 9:32 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:

> Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS.
> 
> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.
> 
> 	-hilmar
> 
> On Feb 5, 2013, at 9:56 PM, Florent Angly wrote:
> 
>> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl).
>> Florent
>> 
>> On 06/02/13 12:40, Leon Timmermans wrote:
>>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
>>> <cjfields at illinois.edu> wrote:
>>>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>>> 
>>>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
>>> I *really* hate saying it, but I fear a lot of places are still stuck
>>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
>>> department still is and doesn't seem to be in a hurry to upgrade, and
>>> I'm pretty sure it won't be the only one (though personally I use a
>>> self-compiled 5.16).
>>> 
>>> Leon
> 
> -- 
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
> ===========================================================


From l.m.timmermans at students.uu.nl  Tue Feb  5 23:11:52 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 05:11:52 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
Message-ID: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>

On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp <hlapp at drycafe.net> wrote:
> Does anyone know what Ubuntu uses?

5.14.2, distrowatch is your friend ;-)

> I've heard lots of other old version problems with CentOS.

I know people who still use CentOS 4 in production :-|

> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.

CentOS 5 is 6 years old (and will be supported another 4), but CentOS
6 is 'only' 19 months old. perl missing a release in the 5.8-5.10
timeframe, combined with an unfortunate alignment of its release
schedule with Red Hat's, doesn't do us any favors here.

Leon


From cjfields at illinois.edu  Tue Feb  5 23:14:24 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 04:14:24 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18E52@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 8:40 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>> 
>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
> 
> I *really* hate saying it, but I fear a lot of places are still stuck
> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
> department still is and doesn't seem to be in a hurry to upgrade, and
> I'm pretty sure it won't be the only one (though personally I use a
> self-compiled 5.16).
> 
> Leon

We had the same problem for a while, but our sysadmins were willing to set up perl 5.12 (at that time) loadable as a module (we can of course set up a local perl as well).  We're now using a sysadmin-installed perl 5.16 with our current cluster.

chris
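
The 'use 5.010' advice quoted above can be sketched roughly as follows. This is only an illustration, not code from BioPerl: the helper name is_known_format and the %opt hash are made up for the example. It shows the version declaration, one 5.10 feature (the defined-or operator), and a membership test written without smart-match, whose behaviour changed between 5.10.0 and 5.10.1.

```perl
use 5.010;      # needs perl >= 5.10.0 and enables say/state; 'use 5.010001' would demand 5.10.1
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical helper: list membership without the smart-match operator (~~).
sub is_known_format {
    my ($format, @known) = @_;
    return defined( first { $_ eq $format } @known ) ? 1 : 0;
}

my %opt    = ();                         # pretend no format option was supplied
my $format = $opt{format} // 'fasta';    # defined-or operator, new in 5.10
say "format=$format known=", is_known_format($format, qw(fasta genbank embl));
```

On a perl older than 5.10 the first line stops compilation with a clear "Perl v5.10.0 required" message, which is usually friendlier than a syntax error further down.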


From cjfields at illinois.edu  Tue Feb  5 23:24:31 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 04:24:31 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 10:11 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp <hlapp at drycafe.net> wrote:
>> Does anyone know what Ubuntu uses?
> 
> 5.14.2, distrowatch is your friend ;-)
> 
>> I've heard lots of other old version problems with CentOS.
> 
> I know people who still use CentOS 4 in production :-|
> 
>> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.
> 
> CentOS 5 is 6 years old (and will be supported another 4), but CentOS
> 6 is 'only' 19 months old. perl missing a release in the 5.8-5.10
> timeframe, combined with an unfortunate alignment of its release
> schedule with Red Hat's, doesn't do us any favors here.
> 
> Leon

Right, it took ~5.5 yrs to go from 5.8 to 5.10 (5.8.0 in July 2002, 5.10.0 in December 2007).  I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7).

We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases.

chris
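
A minimal sketch of that two-stage policy, assuming a made-up table of release lines (the names 'upcoming' and 'future' and the helper perl_ok_for are hypothetical, not anything in BioPerl). In a real distribution the minimum would normally be declared in Build.PL via Module::Build's 'requires => { perl => ... }' entry; the snippet below only shows the version comparison itself.

```perl
use strict;
use warnings;
use version;    # version objects compare 5.008008-style and v5.10.1-style strings correctly

# Hypothetical policy table: minimum perl per BioPerl release line.
my %MIN_PERL = (
    upcoming => 'v5.8.0',     # 5.8 still supported for the next release
    future   => 'v5.10.1',    # later releases drop 5.8
);

# True if perl version $have (default: the running perl, $]) is new enough for $release.
sub perl_ok_for {
    my ($release, $have) = @_;
    $have = "$]" unless defined $have;
    my $min = $MIN_PERL{$release} or die "unknown release line '$release'\n";
    return version->parse($have) >= version->parse($min) ? 1 : 0;
}

for my $release (sort keys %MIN_PERL) {
    printf "%-8s needs %-8s running perl %s: %s\n",
        $release, $MIN_PERL{$release}, $],
        perl_ok_for($release) ? 'ok' : 'too old';
}
```

Keeping the minimum in one place makes the eventual bump a one-line change that can be announced ahead of time.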


From l.m.timmermans at students.uu.nl  Tue Feb  5 23:33:57 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 05:33:57 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXAjt8m9Go9YkGFOUkxw92FUoLFbs0Q_fys-f_gyAwX8yw@mail.gmail.com>

On Wed, Feb 6, 2013 at 5:24 AM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Right, it took ~5.5 yrs to go from 5.8 to 5.10.  I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7).
>
> We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases.

Sounds reasonable. These things shouldn't come as a surprise.

I suspect that the thing that will save us is that most of these
people install it once and then never upgrade.

Leon


From hartzell at alerce.com  Wed Feb  6 12:58:07 2013
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 6 Feb 2013 09:58:07 -0800
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
Message-ID: <20754.39343.128576.743448@gargle.gargle.HOWL>

Fields, Christopher J writes:
 > [...]
 > Right, it took ~5.5 yrs to go from 5.8 to 5.10.  I'd like to point
 > out that Python users are in the same boat: the Python version for
 > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
 > (and recommends python 2.7).   
 > 
 > We can always state that perl 5.8 is supported for the upcoming
 > Bioperl release, but we're dropping v5.8 support for any future
 > releases. 

Do more than drop support for 5.8.

The Perl community has put a transparent and predictable process in
place for releasing [generally] better versions of the language.  It
means that Perl has a chance of continuing to be relevant, attracting
new talent and actually *fixing* some of the s&%t that gives Perl a
bad rap.  It gives people something to plan around; no one should be
surprised that v5.X.Y is coming out in mid 20ZZ.

BioPerl should do the same thing: declare a release policy that trails
along with the Perl release schedule.  Keep it simple and no one can
argue with it.  Support Perl releases as long as the releases
themselves are supported.

Rather than expending energy supporting out of date platforms, put the
energy into being modern (or Modern...), better distro building and
packaging, testing, documentation and releasing so that the process of
staying current is painless.

Look forward.  Keep it interesting and fun.

Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
make their living running sequencing gels in Plexiglas doohickeys on
their lab bench?

I'm not suggesting that the BioPerl community is free to make
arbitrary and capricious changes that make it difficult for *anyone*
to get anything done.  Churn is a waste of time.

But why should the all-volunteer BioPerl community be stuck supporting
code from 12 years ago because it's cost effective for someone else to
avoid spending *their* $/time/people to stay up to date?

Those sites that value stability/maturity/stagnation so highly have
already accepted the cost/difficulty of nailing one of their feet to
the floor as they try to run forward.  They recognize and depend on
the benefits of having that stable base but generally they've also
accepted the costs associated with their restrictive choices.  They
know how to pull in separate kernel/driver updates so that they can
actually run on nearly modern hardware.  They know, and live with, the
fact that they're not going to have access to the shiny new stuff.
And they know how to stay up to date, when they need to, with the
software that their users need to be competitive (e.g. BioConductor
and R).

As long as (if/when...) updating a BioPerl release is something that
can reliably happen with a few cpanm invocations then the sites that
otherwise favor punctuated equilibrium will learn to handle gradual
change.

Those folks that are "stuck" on older releases always have the option
of supporting professional Perl programmers to keep older releases
going, backport changes, etc....  They're already buying support for
their platforms (or freeloading and coping), let them put bread on the
table at one of the bioinformatics consultancies or labs if they have
something special they need.

Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
one is paying you to be backwards compatible with the previous
millennium.

g.


From amackey at virginia.edu  Wed Feb  6 13:47:46 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Wed, 6 Feb 2013 13:47:46 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
Message-ID: <CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>

Huzzah!

--
Aaron J. Mackey, PhD
Assistant Professor
Center for Public Health Genomics
University of Virginia
amackey at virginia.edu
http://www.cphg.virginia.edu/mackey


On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell <hartzell at alerce.com> wrote:

> Fields, Christopher J writes:
>  > [...]
>  > Right, it took ~5.5 yrs to go from 5.8 to 5.10.  I'd like to point
>  > out that Python users are in the same boat: the Python version for
>  > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>  > (and recommends python 2.7).
>  >
>  > We can always state that perl 5.8 is supported for the upcoming
>  > Bioperl release, but we're dropping v5.8 support for any future
>  > releases.
>
> Do more than drop support for 5.8.
>
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
>
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
>
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
>
> Look forward.  Keep it interesting and fun.
>
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
>
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that makes it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
>
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date.
>
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
>
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
>
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
>
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
>
> g.


From tiago.hori at gmail.com  Wed Feb  6 08:25:41 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Wed, 6 Feb 2013 05:25:41 -0800 (PST)
Subject: [Bioperl-l] Problems installing Bio::Tools::Run::StandAloneBlastPlus
Message-ID: <9b488c6e-34b3-4269-a7ac-e2206720939a@googlegroups.com>

Hi Guys,

I am trying to install the module Bio::Tools::Run::StandAloneBlastPlus, but 
it has been hard so far.

I managed to install and compile samtools, after finding all the 
dependencies, but I am still missing something! I posted the complete 
report below!

Any help would be great!

Cheers,

T.

cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus
Reading '/home/tiagohori/.cpan/Metadata'
  Database was generated on Tue, 05 Feb 2013 18:41:03 GMT
Running install for module 'Bio::Tools::Run::StandAloneBlastPlus'
Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz
Checksum for /home/tiagohori/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz ok
Scanning cache /home/tiagohori/.cpan/build for sizes
..................................------------------------------------------DONE
DEL(1/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz 
DEL(2/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz.yml 
DEL(3/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO 
DEL(4/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO.yml 
DEL(5/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC 
DEL(6/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC.yml 
DEL(7/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt 
DEL(8/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt.yml 
DEL(9/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4 
DEL(10/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4.yml 
DEL(11/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5 
DEL(12/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5.yml 
DEL(13/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn 
DEL(14/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn.yml 
DEL(15/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o 
DEL(16/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o.yml 
DEL(17/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U 
DEL(18/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U.yml 
DEL(19/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v 
DEL(20/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v.yml 

  CPAN.pm: Building C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz

Install scripts? y/n [n ]
n 
Do you want to run tests that require connection to servers across the internet
(likely to cause some failures)? y/n [n ]
n 
  - will not run internet-requiring tests
Created MYMETA.yml and MYMETA.json
Creating new 'Build' script for 'BioPerl-Run' version '1.006900'
Building BioPerl-Run
  CJFIELDS/BioPerl-Run-1.006900.tar.gz
  ./Build -- OK
Running Build test
t/Amap.t ...................... 1/18 # Required executable for 
Bio::Tools::Run::Alignment::Amap is not present
t/Amap.t ...................... ok     
t/AnalysisFactory_soap.t ...... skipped: Network tests have not been 
requested
t/Analysis_soap.t ............. skipped: Network tests have not been 
requested
t/BEDTools.t .................. 3/423 # Required executable for 
Bio::Tools::Run::BEDTools is not present
t/BEDTools.t .................. ok       
t/BWA.t ....................... 1/36 # Required executable for 
Bio::Tools::Run::BWA is not present
t/BWA.t ....................... ok     
t/Blat.t ...................... 1/33 # Required executable for 
Bio::Tools::Run::Alignment::Blat is not present
# Looks like you planned 33 tests but ran 20.
t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, 
0xff00)
Failed 13/33 subtests 
(less 15 skipped subtests: 5 okay)
t/Bowtie.t .................... 1/73 # Required executable for 
Bio::Tools::Run::Bowtie is not present
t/Bowtie.t .................... ok     
t/Cap3.t ...................... 1/91 # Required executable for 
Bio::Tools::Run::Cap3 is not present
t/Cap3.t ...................... ok     
t/Clustalw.t .................. 1/45 # Required executable for 
Bio::Tools::Run::Alignment::Clustalw is not present
t/Clustalw.t .................. ok     
t/Coil.t ...................... 2/6 # Required executable for 
Bio::Tools::Run::Coil is not present
t/Coil.t ...................... ok   
t/Consense.t .................. 1/9 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::Consense is not present
t/Consense.t .................. ok   
t/DBA.t ....................... 1/18 # Required executable for 
Bio::Tools::Run::Alignment::DBA is not present
t/DBA.t ....................... ok     
t/DrawGram.t .................. 1/6 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::DrawGram is not present
t/DrawGram.t .................. ok   
t/DrawTree.t .................. 1/6 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::DrawTree is not present
t/DrawTree.t .................. ok   
t/EMBOSS.t .................... ok     
t/Ensembl.t ................... skipped: Network tests have not been 
requested
t/Eponine.t ................... 1/7 # Looks like you planned 7 tests but 
ran 2.
t/Eponine.t ................... Dubious, test returned 255 (wstat 65280, 
0xff00)
Failed 5/7 subtests 
t/Exonerate.t ................. 1/89 # Required executable for 
Bio::Tools::Run::Alignment::Exonerate is not present
t/Exonerate.t ................. ok     
t/FootPrinter.t ............... 1/24 # Required executable for 
Bio::Tools::Run::FootPrinter is not present
t/FootPrinter.t ............... ok     
t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable 
$GENEMARK_MODELS is not set
t/Genemark.hmm.prokaryotic.t .. ok     
t/Genewise.t .................. 1/20 # Required executable for 
Bio::Tools::Run::Genewise is not present
t/Genewise.t .................. ok     
t/Genscan.t ................... 1/6 # Required environment variable 
$GENSCANDIR is not set
t/Genscan.t ................... ok   
t/Gerp.t ...................... 1/33 # Required executable for 
Bio::Tools::Run::Phylo::Gerp is not present
t/Gerp.t ...................... ok     
t/Glimmer2.t .................. 1/217 # Required executable for 
Bio::Tools::Run::Glimmer is not present
t/Glimmer2.t .................. ok       
t/Glimmer3.t .................. 1/111 # Required executable for 
Bio::Tools::Run::Glimmer is not present
t/Glimmer3.t .................. ok       
t/Gumby.t ..................... 1/124 # Required executable for 
Bio::Tools::Run::Phylo::Gumby is not present
t/Gumby.t ..................... ok       
t/Hmmer.t ..................... 1/27 # Required executable for 
Bio::Tools::Run::Hmmer is not present
t/Hmmer.t ..................... ok     
t/Hyphy.t ..................... 2/15 # Required executable for 
Bio::Tools::Run::Phylo::Hyphy::SLAC is not present
t/Hyphy.t ..................... ok     
t/Infernal.t .................. 1/43 # Required executable for 
Bio::Tools::Run::Infernal is not present
t/Infernal.t .................. ok     
t/Kalign.t .................... 1/8 # Required executable for 
Bio::Tools::Run::Alignment::Kalign is not present
t/Kalign.t .................... ok   
t/LVB.t ....................... 1/19 # Required executable for 
Bio::Tools::Run::Phylo::LVB is not present
t/LVB.t ....................... ok     
t/Lagan.t ..................... 1/12 # Required executable for 
Bio::Tools::Run::Alignment::Lagan is not present
t/Lagan.t ..................... ok     
t/MAFFT.t ..................... 1/17 # Required executable for 
Bio::Tools::Run::Alignment::MAFFT is not present
t/MAFFT.t ..................... ok     
t/MCS.t ....................... 1/24 # Required executable for 
Bio::Tools::Run::MCS is not present
t/MCS.t ....................... ok     
t/Maq.t ....................... 1/51 # Required executable for 
Bio::Tools::Run::Maq is not present
t/Maq.t ....................... ok     
t/Match.t ..................... 1/7 # Required executable for 
Bio::Tools::Run::Match is not present
t/Match.t ..................... ok   
t/Mdust.t ..................... 1/5 # Required executable for 
Bio::Tools::Run::Mdust is not present
t/Mdust.t ..................... ok   
t/Meme.t ...................... 1/25 # Required executable for 
Bio::Tools::Run::Meme is not present
t/Meme.t ...................... ok     
t/Minimo.t .................... 1/72 # Required executable for 
Bio::Tools::Run::Minimo is not present
t/Minimo.t .................... ok     
t/Molphy.t .................... 1/10 # Required executable for 
Bio::Tools::Run::Phylo::Molphy::ProtML is not present
t/Molphy.t .................... ok     
t/Muscle.t .................... 1/16 # Required executable for 
Bio::Tools::Run::Alignment::Muscle is not present
t/Muscle.t .................... ok     
t/Neighbor.t .................. 1/17 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::Neighbor is not present
t/Neighbor.t .................. ok     
t/Newbler.t ................... 1/98 # Required executable for 
Bio::Tools::Run::Newbler is not present
t/Newbler.t ................... ok     
t/Njtree.t .................... 1/6 # Required executable for 
Bio::Tools::Run::Phylo::Njtree::Best is not present
t/Njtree.t .................... ok   
t/PAML.t ...................... 1/28 # Required executable for 
Bio::Tools::Run::Phylo::PAML::Codeml is not present
t/PAML.t ...................... ok     
t/Pal2Nal.t ................... 1/9 # Required executable for 
Bio::Tools::Run::Alignment::Pal2Nal is not present
t/Pal2Nal.t ................... ok   
t/PhastCons.t ................. 1/181 # Required executable for 
Bio::Tools::Run::Phylo::Phast::PhastCons is not present
t/PhastCons.t ................. ok       
t/Phrap.t ..................... 1/127 # Required executable for 
Bio::Tools::Run::Phrap is not present
t/Phrap.t ..................... ok       
t/Phyml.t ..................... 1/47 # Required executable for 
Bio::Tools::Run::Phylo::Phyml is not present
t/Phyml.t ..................... ok     
t/Primate.t ................... 1/8 # Required executable for 
Bio::Tools::Run::Primate is not present
t/Primate.t ................... ok   
t/Primer3.t ................... 1/9 # Required executable for 
Bio::Tools::Run::Primer3 is not present
t/Primer3.t ................... ok   
t/Prints.t .................... 1/7 # Required executable for 
Bio::Tools::Run::Prints is not present
t/Prints.t .................... ok   
t/Probalign.t ................. 1/13 # Required executable for 
Bio::Tools::Run::Alignment::Probalign is not present
t/Probalign.t ................. ok     
t/Probcons.t .................. 1/11 # Required executable for 
Bio::Tools::Run::Alignment::Probcons is not present
t/Probcons.t .................. ok     
t/Profile.t ................... 1/7 # Required executable for 
Bio::Tools::Run::Profile is not present
t/Profile.t ................... ok   
t/Promoterwise.t .............. 1/9 # Required executable for 
Bio::Tools::Run::Promoterwise is not present
t/Promoterwise.t .............. ok   
t/ProtDist.t .................. 1/14 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::ProtDist is not present
t/ProtDist.t .................. ok     
t/ProtPars.t .................. 1/11 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::ProtPars is not present
t/ProtPars.t .................. ok     
t/Pseudowise.t ................ 1/18 # Required executable for 
Bio::Tools::Run::Pseudowise is not present
t/Pseudowise.t ................ ok     
t/QuickTree.t ................. 1/13 # Required executable for 
Bio::Tools::Run::Phylo::QuickTree is not present
t/QuickTree.t ................. ok     
t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as  or 
not executable. 
# Required executable for Bio::Tools::Run::RepeatMasker is not present
t/RepeatMasker.t .............. ok     
t/SABlastPlus.t ............... 1/65 # Required executable for 
Bio::Tools::Run::BlastPlus is not present
# Looks like you planned 65 tests but ran 63.
t/SABlastPlus.t ............... Dubious, test returned 255 (wstat 65280, 
0xff00)
Failed 2/65 subtests 
(less 59 skipped subtests: 4 okay)
t/SLR.t ....................... 1/7 # Required executable for 
Bio::Tools::Run::Phylo::SLR is not present
t/SLR.t ....................... ok   
t/Samtools.t .................. ok     
t/Seg.t ....................... 1/8 # Required executable for 
Bio::Tools::Run::Seg is not present
t/Seg.t ....................... ok   
t/Semphy.t .................... 1/19 # Required executable for 
Bio::Tools::Run::Phylo::Semphy is not present
t/Semphy.t .................... ok     
t/SeqBoot.t ................... 1/9 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present
t/SeqBoot.t ................... ok   
t/Signalp.t ................... 1/7 # Required executable for 
Bio::Tools::Run::Signalp is not present
t/Signalp.t ................... ok   
t/Sim4.t ...................... 1/23 # Required executable for 
Bio::Tools::Run::Alignment::Sim4 is not present
t/Sim4.t ...................... ok     
t/Simprot.t ................... 1/6 # Required executable for 
Bio::Tools::Run::Simprot is not present
t/Simprot.t ................... ok   
t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap 
(or dependencies thereof) was not installed
t/SoapEU-unit.t ............... skipped: The optional module Bio::DB::ESoap 
(or dependencies thereof) was not installed
t/StandAloneFasta.t ........... 1/15 # Required executable for 
Bio::Tools::Run::Alignment::StandAloneFasta is not present
t/StandAloneFasta.t ........... ok     
t/TCoffee.t ................... 1/27 # Required executable for 
Bio::Tools::Run::Alignment::TCoffee is not present
t/TCoffee.t ................... ok     
t/TigrAssembler.t ............. 1/88 # Required executable for 
Bio::Tools::Run::TigrAssembler is not present
# Required executable for Bio::Tools::Run::TigrAssembler is not present
t/TigrAssembler.t ............. ok     
t/Tmhmm.t ..................... 1/9 # Required executable for 
Bio::Tools::Run::Tmhmm is not present
t/Tmhmm.t ..................... ok   
t/TribeMCL.t .................. ok     
t/Vista.t ..................... ok   
t/gmap-run.t .................. 1/8 # Required executable for 
Bio::Tools::Run::Alignment::Gmap is not present
t/gmap-run.t .................. ok   
t/tRNAscanSE.t ................ 1/12 # Required executable for 
Bio::Tools::Run::tRNAscanSE is not present
t/tRNAscanSE.t ................ ok     

Test Summary Report
-------------------
t/Blat.t                    (Wstat: 65280 Tests: 20 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 33 tests but ran 20.
t/Eponine.t                 (Wstat: 65280 Tests: 2 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 7 tests but ran 2.
t/SABlastPlus.t             (Wstat: 65280 Tests: 63 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 65 tests but ran 63.
Files=80, Tests=2876, 39 wallclock secs ( 0.54 usr  0.23 sys + 32.54 cusr  4.94 csys = 38.25 CPU)
Result: FAIL
Failed 3/80 test programs. 0/2876 subtests failed.
  CJFIELDS/BioPerl-Run-1.006900.tar.gz
  ./Build test -- NOT OK
//hint// to see the cpan-testers results for installing this module, try:
  reports CJFIELDS/BioPerl-Run-1.006900.tar.gz
Running Build install
  make test had returned bad status, won't install without force
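
In the report above, the three failing test programs (t/Blat.t, t/Eponine.t, t/SABlastPlus.t) all fail on a "Bad plan" rather than on failed subtests, and t/SABlastPlus.t notes that the required executable for Bio::Tools::Run::BlastPlus is not present. One possible first check, sketched here under the assumption that the BLAST+ programs are expected on PATH, is to see whether perl can find them at all. The helper find_in_path and the list of program names checked are choices made for this example, not part of BioPerl.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full path of $prog if it is an executable
# file in one of @dirs (default: the directories listed in PATH), else undef.
sub find_in_path {
    my ($prog, @dirs) = @_;
    @dirs = File::Spec->path unless @dirs;
    for my $dir (@dirs) {
        my $candidate = File::Spec->catfile($dir, $prog);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

# A few BLAST+ programs; which ones to check is an assumption for this sketch.
for my $prog (qw(blastn blastp makeblastdb)) {
    my $found = find_in_path($prog);
    printf "%-12s %s\n", $prog, defined $found ? $found : 'NOT FOUND on PATH';
}
```

If the programs are reported as not found, installing NCBI BLAST+ (or adding its bin directory to PATH) before re-running the build would be the thing to try; the other "Required executable ... is not present" lines are for wrappers that are skipped cleanly.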


From guy.leonard at gmail.com  Wed Feb  6 13:35:38 2013
From: guy.leonard at gmail.com (guy.leonard at gmail.com)
Date: Wed, 6 Feb 2013 10:35:38 -0800 (PST)
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>

Nice, super work. 

Will there be a rough list of feature changes/additions/deprecations, or 
should I consult the git logs?

On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote:
>
> All, 
>
> I am scheduling the next BioPerl CPAN release tentatively for March 1. 
>  Any help in triaging bug reports would be greatly appreciated!   
>
> Amongst all other changes, as mentioned in a separate thread we will 
> remove Bio::FeatureIO, now developed in a separate repository: 
>
>     https://github.com/bioperl/Bio-FeatureIO 
>
> Feedback, suggestions, etc are greatly appreciated. 
>
> chris 


From sidd.basu at gmail.com  Wed Feb  6 14:36:17 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 6 Feb 2013 13:36:17 -0600
Subject: [Bioperl-l]  Re: Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com>

Hi, 

On Tue, 05 Feb 2013, Fields, Christopher J wrote:

> All,
> 
> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
> 
> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
> 
>     https://github.com/bioperl/Bio-FeatureIO
> 
> Feedback, suggestions, etc are greatly appreciated.

Here are the CI build reports for Perl 5.12, 5.14 and 5.16 using Travis. 
https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true

I could not get 5.10 to work on Travis. Although I activated the --network
option, it still didn't run one of the tests that needs network access. Also, I was
initially confused by the fact that, although the repository has a dist.ini, the tests
still have to be run through Build.PL. Running **dzil test** does not work.

Hope this helps.

thanks, 
-siddhartha

> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Feb  6 14:46:49 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 19:46:49 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A109@CHIMBX5.ad.uillinois.edu>

We've been a little better at keeping track of significant changes this time 'round.  There aren't a lot of major updates, but it's important to make sure we get a release out to ensure everyone (not just those familiar with git) can access them.

chris

On Feb 6, 2013, at 12:35 PM, <guy.leonard at gmail.com>
 wrote:

> Nice, super work. 
> 
> Will there be a rough list of feature changes/addition/deprecation, or 
> shall I consult git logs?
> 
> On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote:
>> 
>> All, 
>> 
>> I am scheduling the next BioPerl CPAN release tentatively for March 1. 
>> Any help in triaging bug reports would be greatly appreciated!   
>> 
>> Amongst all other changes, as mentioned in a separate thread we will 
>> remove Bio::FeatureIO, now developed in a separate repository: 
>> 
>>    https://github.com/bioperl/Bio-FeatureIO 
>> 
>> Feedback, suggestions, etc are greatly appreciated. 
>> 
>> chris 
>> _______________________________________________ 
>> Bioperl-l mailing list 
>> Biop... at lists.open-bio.org 
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l 
>> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Feb  6 14:54:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 19:54:58 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>

On Feb 6, 2013, at 1:36 PM, Siddhartha Basu <sidd.basu at gmail.com>
 wrote:

> Hi, 
> 
> On Tue, 05 Feb 2013, Fields, Christopher J wrote:
> 
>> All,
>> 
>> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
>> 
>> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
>> 
>>    https://github.com/bioperl/Bio-FeatureIO
>> 
>> Feedback, suggestions, etc are greatly appreciated.
> 
> Here are CI build report on 5.12, 5.14 and 5.16 using travis. 
> https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
> https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
> https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true
> 
> Could not get 5.10 to work on travis. Though i activated the (--network)
> option,  it still didn't run one of the test that needs network. Also, initially got
> confused by the fact that though it has dist.ini,  the tests still has
> to run through Build.PL. Running **dzil test** do not work.
> 
> Hope this helps.
> 
> thanks, 
> -siddhartha

Just to point out, that was for Bio-FeatureIO.  Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release).  

Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken).  I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed.  Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation.

chris


From sidd.basu at gmail.com  Wed Feb  6 15:26:06 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 6 Feb 2013 14:26:06 -0600
Subject: [Bioperl-l]  Re: Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
Message-ID: <5112bc60.c69e320a.1e98.2028@mx.google.com>

On Wed, 06 Feb 2013, Fields, Christopher J wrote:

> On Feb 6, 2013, at 1:36 PM, Siddhartha Basu <sidd.basu at gmail.com>
>  wrote:
> 
> > Hi, 
> > 
> > On Tue, 05 Feb 2013, Fields, Christopher J wrote:
> > 
> >> All,
> >> 
> >> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
> >> 
> >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
> >> 
> >>    https://github.com/bioperl/Bio-FeatureIO
> >> 
> >> Feedback, suggestions, etc are greatly appreciated.
> > 
> > Here are CI build report on 5.12, 5.14 and 5.16 using travis. 
> > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
> > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
> > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true
> > 
> > Could not get 5.10 to work on travis. Though i activated the (--network)
> > option,  it still didn't run one of the test that needs network. Also, initially got
> > confused by the fact that though it has dist.ini,  the tests still has
> > to run through Build.PL. Running **dzil test** do not work.
> > 
> > Hope this helps.
> > 
> > thanks, 
> > -siddhartha
> 
> Just to point out, that was for Bio-FeatureIO.  Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release).  
So, what steps are left for getting the release out to CPAN? For example, are
there a lot of feature branches still to be merged, or a lot of unit
tests still not passing? I am just trying to figure out whether I
could be of any help in expediting the release process. However, if these
are already taken care of, please ignore this.

> 
> Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken).  I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed.  
As for the error I encountered, the presence of Build.PL was blocking the dzil
build/release process. By default, dzil expects to generate
Build.PL itself during its build/release process. However, I am not sure which
mode is the most suitable for BioPerl devs.
> Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation.

thanks, 
-siddhartha

> 
> chris


From hlapp at drycafe.net  Wed Feb  6 16:30:33 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 6 Feb 2013 16:30:33 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
Message-ID: <A78F0D43-8296-45CF-9409-320D1FE7CA2F@drycafe.net>

Great points, George, and you're making a very compelling argument. I'm in total agreement. It's almost becoming a reason to be embarrassed to still be programming in Perl these days, so one might as well have fun while it lasts.

	-hilmar

On Feb 6, 2013, at 12:58 PM, George Hartzell wrote:

> Fields, Christopher J writes:
>> [...]
>> Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>> out that Python users are in the same boat: the Python version for
>> CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>> (and recommends python 2.7).   
>> 
>> We can always state that perl 5.8 is supported for the upcoming
>> Bioperl release, but we're dropping v5.8 support for any future
>> releases. 
> 
> Do more than drop support for 5.8.
> 
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
> 
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
> 
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
> 
> Look forward.  Keep it interesting and fun.
> 
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
> 
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that makes it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
> 
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date.
> 
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
> 
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
> 
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
> 
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
> 
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From cjfields at illinois.edu  Wed Feb  6 17:11:06 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:11:06 +0000
Subject: [Bioperl-l] BioPerl long-term, was Re:  dependencies on perl version
In-Reply-To: <CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>

George,

Should put your post on a pedestal :)

tl;dr version: I completely agree, but we need help in order to do this.

Long(-winded) version:

I agree completely, backwards compatibility is killing us.  But, we do need current and new people to get involved and help drive this forward.  We need people on all fronts, from coding and bug fixes to documentation and web site maintenance.  I've been driving this bus for a number of years now.  Not getting tired yet, but I am getting substantially busier with my current endeavors, so my time spent working on BioPerl has dwindled considerably.  Any additional support or sharing of responsibilities will help tremendously in keeping up momentum (if someone else wants to take the wheel for a bit, please let me know :).  

If we follow the perl release route, we should streamline the release process (think Dist::Zilla), end support of older versions of Perl, and work on a sustainable release schedule.  The fact that we have so many of us so-called 'old folks' speaking up in favor of this is a very good sign.  We do need a bit more than that; we need help.  BioPerl is a very large project.

There is a key point we need to address, one that is very important for the future of BioPerl.  I use Perl quite a bit in my current work (dabble with Ruby and Python as well when I have to).  BioPerl?  A little, but not as much as I could.  

Shocked?  The three main reasons I don't use it 'in anger':  performance, performance, and performance.  It is very important that we make a concerted effort to address this at all levels.  It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them).  

A specific example: Heng Li once tested the performance of FASTQ parsing (perl, python, bioperl, biopython, his C code, etc). BioPerl's FASTQ couldn't even be measured; IIRC it went on for many hours until he killed it.  This was with the older version of the parser, but I'm willing to bet the newer one I wrote isn't any better.

This. needs. to. change.

I see no problem in stating that any generic parsing and low-level interfaces are just as much a part of what BioPerl encompasses as the higher-level Bio::* classes themselves.  Steve and Jason were on to something with SearchIO; it's maybe not as performant as we would like, but it certainly is more flexible in terms of what can be done, b/c it separates out low-level parsing from object creation.  That's the general model we should look at.  There is a good reason Biopython is following this model with their SearchIO implementation (Peter C, are you reading this?).
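
To make the "separate low-level parsing from object creation" idea concrete, here is a minimal sketch.  It is written in Python purely for illustration (the idea is language-neutral), and it is not the actual SearchIO code: the three-column tab-separated input and every function, class, and method name below are hypothetical.  The parser only emits events; the caller decides whether to pay for building objects.

```python
import io


def parse_tabular(handle, handler):
    """Low-level parser: reads lines and emits events; builds no objects.

    Assumes a hypothetical tab-separated format: query, subject, identity.
    """
    current_query = None
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        query, subject, identity = line.split("\t")[:3]
        if query != current_query:
            if current_query is not None:
                handler.end_result()
            handler.start_result(query)
            current_query = query
        handler.hit(subject, float(identity))
    if current_query is not None:
        handler.end_result()


class CountingHandler:
    """Cheap consumer: only counts hits per query."""

    def __init__(self):
        self.counts = {}
        self._query = None

    def start_result(self, query):
        self._query = query
        self.counts[query] = 0

    def hit(self, subject, identity):
        self.counts[self._query] += 1

    def end_result(self):
        self._query = None


class ObjectBuildingHandler:
    """Heavier consumer: builds a nested result structure."""

    def __init__(self):
        self.results = []

    def start_result(self, query):
        self.results.append({"query": query, "hits": []})

    def hit(self, subject, identity):
        self.results[-1]["hits"].append({"subject": subject, "identity": identity})

    def end_result(self):
        pass


SAMPLE = "q1\ts1\t98.5\nq1\ts2\t91.0\nq2\ts3\t100.0\n"

counter = CountingHandler()
parse_tabular(io.StringIO(SAMPLE), counter)

builder = ObjectBuildingHandler()
parse_tabular(io.StringIO(SAMPLE), builder)
```

The same parser drives both handlers, so someone who only needs counts never pays for the object model, while the object-building path stays available for those who want it.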

We have a lot of very talented people involved with this project, both on the purely computational and purely biological end as well as the folks like me who straddle the two domains.  There is a lot of good code out there that can be used, wrapped, and taken advantage of, including everything we currently have in BioPerl.  Let's come up with something that both works and works well, that people can use on a regular basis, even at a low level if they choose.  That alone would dissuade new users from writing up (yet another) custom FASTA/FASTQ/BLAST/GenBank/etc parser b/c the BioPerl one takes millennia to finish.  

A few examples on this front: Rob Buels created a generic parser for GFF3 (Bio::GFF3::LowLevel) with very few dependencies; we wrap this with the newer Bio::FeatureIO code.  Leon has Bio::SFF.  Lincoln of course wrote Bio::DB::Sam and Bio::DB::BigFile.  I have started a wrapper around Heng's FASTQ/FASTA parsing code (kseq); it seems to work quite well (~20M FASTQ in 30 sec last I recall?).  

So:

If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.

If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.

If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.

If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).

If it means this is to be BioPerl 2.0, then let's move in that direction, sooner rather than later.

But I can't do it alone.  We (not just me, but we) need to drive the direction we take.

First one who codes gets the gold ring.

chris

On Feb 6, 2013, at 12:47 PM, Aaron Mackey <amackey at virginia.edu>
 wrote:

> Huzzah!
> 
> --
> Aaron J. Mackey, PhD
> Assistant Professor
> Center for Public Health Genomics
> University of Virginia
> amackey at virginia.edu
> http://www.cphg.virginia.edu/mackey
> 
> 
> On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell <hartzell at alerce.com> wrote:
> Fields, Christopher J writes:
>  > [...]
>  > Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>  > out that Python users are in the same boat: the Python version for
>  > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>  > (and recommends python 2.7).
>  >
>  > We can always state that perl 5.8 is supported for the upcoming
>  > Bioperl release, but we're dropping v5.8 support for any future
>  > releases.
> 
> Do more than drop support for 5.8.
> 
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
> 
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
> 
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
> 
> Look forward.  Keep it interesting and fun.
> 
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
> 
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that makes it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
> 
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date.
> 
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
> 
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
> 
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
> 
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
> 
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From cjfields at illinois.edu  Wed Feb  6 17:34:42 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:34:42 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re:  dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1AF0C@CHIMBX5.ad.uillinois.edu>

I want to clarify: parser optimization isn't the only point we need to focus on by any means (and may not be the main one).  There is a lot of room for improvement top to bottom; that was one specific example I have long held to be an issue.

-c

On Feb 6, 2013, at 4:11 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:

> Shocked?  The three main reasons I don't use it 'in anger':  performance, performance, and performance.  It is very important that we make a concerted effort to address this at all levels.  It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them).  
...


From p.j.a.cock at googlemail.com  Wed Feb  6 17:43:13 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 6 Feb 2013 22:43:13 +0000
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>

On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
>
> I see no problem in stating any generic parsing and low-level interfaces
> are just as much a part of what BioPerl encompasses as the higher-level
> Bio::* classes themselves.  Steve and Jason were on to something with
> SearchIO; it's maybe not as performant as we would like, but it certainly
> is more flexible in terms of what can be done, b/c it separates out
> low-level parsing from object creation.  That's the general model we
> should look at.  There is a good reason Biopython is following this
> model with their SearchIO implementation (Peter C, are you reading this?)

Actually I don't think we did end up with that kind of separation in the
Biopython SearchIO - which is not to say it isn't an excellent model
to follow. Rather, the Biopython SearchIO (like the BioPerl one) had
as its first goal a consistent object model across assorted file
formats.

The idea of low-level, minimal-overhead parsers (which are very
format specific), on which a heavier but consistent object model
can be built, might be a good balance - the high-level API has the
convenience, but if you give that up you can have more speed.
That's what I recommend with FASTQ and Biopython, e.g.
http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
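
A minimal sketch of that split, in plain Python with no Biopython
dependency. The function and class names here are hypothetical (this is
not Biopython's API), and it assumes strict four-line FASTQ records with
no blank lines and a Phred+33 quality encoding:

```python
import io
from dataclasses import dataclass


def fastq_tuples(handle):
    """Low-level layer: yield (title, sequence, quality) as plain strings."""
    while True:
        title = handle.readline()
        if not title:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\n")
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near %r" % title)
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in %r" % title)
        yield title[1:].rstrip("\n"), seq, qual


@dataclass
class Read:
    """High-level layer: a small record object with decoded qualities."""
    id: str
    seq: str
    phred: list


def fastq_records(handle, offset=33):
    """Build Read objects on top of the tuple parser (slower, more convenient)."""
    for title, seq, qual in fastq_tuples(handle):
        read_id = title.partition(" ")[0]
        yield Read(read_id, seq, [ord(c) - offset for c in qual])


SAMPLE = "@r1 first read\nACGT\n+\nIIII\n@r2\nGG\n+\n!!\n"

tuples = list(fastq_tuples(io.StringIO(SAMPLE)))
records = list(fastq_records(io.StringIO(SAMPLE)))
```

Code that only needs, say, read lengths can loop over `fastq_tuples`
and skip the object construction and quality decoding entirely.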

>
> I have started a wrapper around Heng's FASTQ/FASTA parsing
> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> last I recall?).
>

I'd have to dig through my emails, but I think the BioRuby guys
looked at that too - as I recall while it was fast, the error handling
left something to be desired. Email me directly or on the BioRuby
list if you want to follow up on that.

Regards,

Peter


From cjfields at illinois.edu  Wed Feb  6 17:53:21 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:53:21 +0000
Subject: [Bioperl-l] FASTQ, was Re:  BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>

On Feb 6, 2013, at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> 
>> I see no problem in stating any generic parsing and low-level interfaces
>> are just as much a part of what BioPerl encompasses as the higher-level
>> Bio::* classes themselves.  Steve and Jason were on to something with
>> SearchIO; it's maybe not as performant as we would like, but it certainly
>> is more flexible in terms of what can be done, b/c it separates out
>> low-level parsing from object creation.  That's the general model we
>> should look at.  There is a good reason Biopython is following this
>> model with their SearchIO implementation (Peter C, are you reading this?)
> 
> Actually I don't think we did end up with that kind of separation in the
> Biopython SearchIO - which is not so say it isn't an excellent model
> to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> as the first goal a consistent object model across assorted file
> formats.
> 
> The idea of a low level minimal overhead parsers (which are very
> format specific), on which a heavier but consistent object model
> can be built might be a good balance - the high level API has the
> connivence, but if you give that up you can have more speed.
> That's what I recommend with FASTQ and Biopython, e.g.
> http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> 
>> 
>> I have started a wrapper around Heng's FASTQ/FASTA parsing
>> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
>> last I recall?).
>> 
> 
> I'd have to dig through my emails, but I think the BioRuby guys
> looked at that too - as I recall while it was fast, the error handling
> left something to be desired. Email me directly or on the BioRuby
> list if you want to follow up on that.
> 
> Regards,
> 
> Peter

I did a little work on this and it is worth following up on; I pulled the FASTQ test examples you created from the paper to test it out.  IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into.  It may be worth moving to open-bio-l for broader discussion.

chris


From whereverroadgoes at gmail.com  Wed Feb  6 16:59:04 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Wed, 6 Feb 2013 13:59:04 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <87txpr26jj.fsf@topper.koldfront.dk>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
	<87txpr26jj.fsf@topper.koldfront.dk>
Message-ID: <411e920d-e614-417d-9198-78bef9adba16@googlegroups.com>

Everything's working now! Thank you very much, especially to you Adam!




From carandraug+dev at gmail.com  Wed Feb  6 20:38:20 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Thu, 7 Feb 2013 01:38:20 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_0esYVUe_0gZHdAtk4orJQMO82fLjnfNL3Nap=BqX7RWw@mail.gmail.com>

On 5 February 2013 20:56, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 5, 2013, at 2:06 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>> how much perl backwards compatibility does bioperl need to keep?
>
> Aim for 5.10.1, but be careful of smart-match.

Well, I solved my problem differently and ended up not needing any of
the new features. But next time I'll know. Thanks

Carnë


From pcantalupo at gmail.com  Wed Feb  6 23:04:08 2013
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Wed, 6 Feb 2013 23:04:08 -0500
Subject: [Bioperl-l] bug 3376 status needs updated
Message-ID: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>

Hi,

A few months ago, I fixed bug 3376 (
https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
updated to resolved or closed. Should I do this or is Chris the only one
who does that?

Thank you,

Paul


From cjfields at illinois.edu  Wed Feb  6 23:20:30 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 04:20:30 +0000
Subject: [Bioperl-l] bug 3376 status needs updated
In-Reply-To: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>
References: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B45C@CHIMBX5.ad.uillinois.edu>

No, go ahead and close it.  Let me know if you run into permission problems with it.

chris

On Feb 6, 2013, at 10:04 PM, Paul Cantalupo <pcantalupo at gmail.com>
 wrote:

> Hi,
> 
> A few months ago, I fixed bug 3376 (
> https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
> The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
> updated to resolved or closed. Should I do this or is Chris the only one
> who does that?
> 
> Thank you,
> 
> Paul
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From l.m.timmermans at students.uu.nl  Thu Feb  7 04:07:57 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Thu, 7 Feb 2013 10:07:57 +0100
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <5112bc60.c69e320a.1e98.2028@mx.google.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
	<5112bc60.c69e320a.1e98.2028@mx.google.com>
Message-ID: <CAC1jpXDQG8NwaPKd8PEVqWs7NWHHAkrGaasCeJ+bKVy1z0he1Q@mail.gmail.com>

On Wed, Feb 6, 2013 at 9:26 PM, Siddhartha Basu <sidd.basu at gmail.com> wrote:
> As for the error I encountered, the presence of Build.PL was blocking the
> dzil build/release process. By default, dzil expects to generate
> Build.PL itself during its build/release process. However, I am not sure
> which mode is the most suitable for BioPerl devs.

You can prune the Build.PL, and then let dzil add its own. We wouldn't
be the first to do that sort of thing.

Leon


From amackey at virginia.edu  Thu Feb  7 10:25:07 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 7 Feb 2013 10:25:07 -0500
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>

You might also want to consider a lazy/pull-based parser to defer
parsing/object-building for pieces of the object that don't get used.  This
also usually provides some error tolerance.

-Aaron

--
Aaron J. Mackey, PhD
Assistant Professor
Center for Public Health Genomics
University of Virginia
amackey at virginia.edu
http://www.cphg.virginia.edu/mackey


On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J <cjfields at illinois.edu
> wrote:

> On Feb 6, 2013, at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> > <cjfields at illinois.edu> wrote:
> >>
> >> I see no problem in stating any generic parsing and low-level interfaces
> >> are just as much a part of what BioPerl encompasses as the higher-level
> >> Bio::* classes themselves.  Steve and Jason were on to something with
> >> SearchIO; it's maybe not as performant as we would like, but it certainly
> >> is more flexible in terms of what can be done, b/c it separates out
> >> low-level parsing from object creation.  That's the general model we
> >> should look at.  There is a good reason Biopython is following this
> >> model with their SearchIO implementation (Peter C, are you reading this?)
> >
> > Actually I don't think we did end up with that kind of separation in the
> > Biopython SearchIO - which is not to say it isn't an excellent model
> > to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> > as the first goal a consistent object model across assorted file
> > formats.
> >
> > The idea of low-level, minimal-overhead parsers (which are very
> > format specific), on which a heavier but consistent object model
> > can be built might be a good balance - the high level API has the
> > convenience, but if you give that up you can have more speed.
> > That's what I recommend with FASTQ and Biopython, e.g.
> > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> >
> >>
> >> I have started a wrapper around Heng's FASTQ/FASTA parsing
> >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> >> last I recall?).
> >>
> >
> > I'd have to dig through my emails, but I think the BioRuby guys
> > looked at that too - as I recall while it was fast, the error handling
> > left something to be desired. Email me directly or on the BioRuby
> > list if you want to follow up on that.
> >
> > Regards,
> >
> > Peter
>
> I did a little on this, worth following up on, but I pulled the FASTQ test
> examples you created from the paper to test it out.  IIRC it parsed where
> it needed to, but I'm not sure how it handled bad sequences, so yes, worth
> looking into.  Maybe worth moving to open-bio-l for broader discussion.
>
> chris


From tiago.hori at gmail.com  Thu Feb  7 09:58:37 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Thu, 7 Feb 2013 06:58:37 -0800 (PST)
Subject: [Bioperl-l] Search I::O
In-Reply-To: <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
References: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>
	<6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
Message-ID: <e5d61704-086a-4434-ae80-434252d1f55e@googlegroups.com>

Thanks, Jason! It is working now.

So here is what I am trying to accomplish. For a given BLASTX report, I
want to extract the best BLASTX hit that is human and whose description
does not contain "unnamed" or "PREDICTED". I got very close, but I still
can't get it to give me only the top BLAST hit; it gives me all BLAST hits
that meet my criteria. I tried using "last" to stop it from looping through
the hits once it found a human one, but it didn't work. Can someone help?
Here is my code so far (mostly stolen from the wiki).

use strict;
use Bio::SearchIO;

my $in = Bio::SearchIO->new(-format => 'blast',
                            -file   => 'testsalmon.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    if( $hit->description !~ /[Uu]nnamed|PREDICTED|hypothetical/ ) {
      if( $hit->description =~ /Homo sapiens/ ) {
        while( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
            if( $hsp->percent_identity >= 30 ) {
              if( $hsp->evalue <= 1e-05 ) {
                ## one tab-separated line per HSP that passes all filters
                print "Query=",        $result->query_name,    "\t",
                      " Description=", $hit->description,      "\t",
                      " Hit=",         $hit->name,             "\t",
                      " Length=",      $hsp->length('total'),  "\t",
                      " Percent_id=",  $hsp->percent_identity, "\n";
              }
            }
          }
        }
      }
    }
  }
}
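
For reference, one likely reason a "last" appears not to work in nested
loops like these: in Perl a bare "last" only leaves the innermost enclosing
loop (here that would be the HSP loop), so the hit loop keeps running. A
loop label lets "last" leave the outer loop as well. Below is a minimal
sketch of that idea. It uses plain mock data rather than Bio::SearchIO
objects; the hash layout, the names h1..h4 and the label HIT are all made
up for illustration, and it assumes hits come in report order (best first).

```perl
use strict;
use warnings;

# Mock stand-in for the hits of one BLAST result, in report order.
# (Hypothetical data structure, not the Bio::SearchIO object model.)
my @hits = (
    { name => 'h1', description => 'PREDICTED: protein X [Homo sapiens]',
      hsps => [ { evalue => 1e-50 } ] },
    { name => 'h2', description => 'kinase Y [Mus musculus]',
      hsps => [ { evalue => 1e-40 } ] },
    { name => 'h3', description => 'kinase Y [Homo sapiens]',
      hsps => [ { evalue => 1 }, { evalue => 1e-30 } ] },
    { name => 'h4', description => 'kinase Z [Homo sapiens]',
      hsps => [ { evalue => 1e-20 } ] },
);

my $best;
HIT: for my $hit (@hits) {
    # Skip hits whose description fails the filters.
    next HIT if $hit->{description} =~ /[Uu]nnamed|PREDICTED|hypothetical/;
    next HIT unless $hit->{description} =~ /Homo sapiens/;

    for my $hsp ( @{ $hit->{hsps} } ) {
        next unless $hsp->{evalue} <= 1e-05;
        $best = $hit->{name};
        last HIT;    # leaves the hit loop too, not only the HSP loop
    }
}

print "best human hit: ", ( defined $best ? $best : 'none' ), "\n";
```

With the mock data above this stops at h3 and never looks at h4. In the
real script the same label would go on the "while( my $hit = ... )" loop.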


T.


On Wednesday, February 6, 2013 6:46:47 PM UTC-3:30, Jason Stajich wrote:
>
> you are missing a comma after the -format => 'blast' 
> should be 
> my $in = Bio::SearchIO->new(-format => 'blast',   
>   -file => 'XXX' ); 
>
>
> On Feb 5, 2013, at 7:21 AM, Tiago Hori <tiago... at gmail.com> wrote:
>
> > Hi All, 
> > 
> > I am trying to find the best putative orthologs for 44K Atlantic Salmon
> > sequences, and so I need to parse 44K BLAST reports to find the best
> > human hit. I am trying to learn SearchIO, but when I try the first
> > example on the HOWTO:
> >
> > use strict;
> > use Bio::SearchIO;
> > 
> > my $in = new Bio::SearchIO(-format => 'blast' 
> >               -file => 'C001R047.txt'); 
> > 
> > while( my $result = $in->next_result ) { 
> >  ## $result is a Bio::Search::Result::ResultI compliant object 
> >  while( my $hit = $result->next_hit ) { 
> >    ## $hit is a Bio::Search::Hit::HitI compliant object 
> >    while( my $hsp = $hit->next_hsp ) { 
> >      ## $hsp is a Bio::Search::HSP::HSPI compliant object 
> >      if( $hsp->length('total') > 50 ) { 
> >        if ( $hsp->percent_identity >= 75 ) { 
> >          print "Query=",   $result->query_name, 
> >            " Hit=",        $hit->name, 
> >            " Length=",     $hsp->length('total'), 
> >            " Percent_id=", $hsp->percent_identity, "\n"; 
> >        } 
> >      } 
> >    }   
> >  } 
> > } 
> > 
> > I get this error: Odd number of elements in hash assignment at 
> > /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. 
> > 
> > I am using BioPerl version 1.6.901. Is there a format problem with the 
> > blast reports? 
> > 
> > Any help would be greatly appreciated! 
> > 
> > T. 
>
> Jason Stajich
> jason.... at gmail.com
> ja... at bioperl.org
>
>


From cjfields at illinois.edu  Thu Feb  7 10:56:04 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 15:56:04 +0000
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>

This will likely be the approach for a more NGS-friendly Bio::Seq class.  Calculation of the PHRED scores could also be deferred until needed.
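
To make the "deferred until needed" idea concrete, here is a rough sketch
of a quality holder that keeps the raw FASTQ quality string and only turns
it into numeric PHRED scores on first access. The class name DeferredQual
and its methods are hypothetical (not an existing Bio::* class), and the
default ASCII offset of 33 assumes Sanger-style encoding.

```perl
use strict;
use warnings;

package DeferredQual;    # hypothetical name, for illustration only

sub new {
    my ( $class, %args ) = @_;
    my $self = {
        raw_qual => $args{raw_qual},
        offset   => defined $args{offset} ? $args{offset} : 33,  # assumed Sanger offset
    };
    return bless $self, $class;
}

# Decode the quality string on first call, then reuse the cached result.
sub phred {
    my ($self) = @_;
    unless ( $self->{phred} ) {
        $self->{phred} =
          [ map { ord($_) - $self->{offset} } split //, $self->{raw_qual} ];
    }
    return $self->{phred};
}

# True once the scores have actually been computed.
sub is_decoded { return defined $_[0]->{phred} ? 1 : 0 }

package main;

my $q = DeferredQual->new( raw_qual => 'I5!' );
my $before = $q->is_decoded;    # nothing decoded yet
my $scores = $q->phred;         # decoding happens here
my $after  = $q->is_decoded;

print "decoded before access: $before\n";
print "scores: @$scores\n";
print "decoded after access: $after\n";
```

Code that only counts or filters on IDs would never pay for the decoding
step; code that needs the scores pays once per record.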

seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it.

chris

On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:

> You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used.  This also usually provides some error tolerance.
> 
> -Aaron


From amackey at virginia.edu  Thu Feb  7 11:09:14 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 7 Feb 2013 11:09:14 -0500
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>

e.g., a pull-based FASTQ parser that did nothing else at the top level but
"chunk" the file into as-yet-unparsed four-line blobs could appear to work
very fast, if the user code did nothing but count the number of entries:

  while (my $seq = $seqio->next_seq) { $ct++ }

in other words, you defer *everything* except the minimal amount of
parsing/logic required to detect object boundaries.

This is, in fact, the exact opposite of the event-based SearchIO "push"
parsers, which always perform the most parsing possible, despite the user
never accessing most of the material.
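
A rough, self-contained sketch of the pull-based idea described above, so
the trade-off is easier to see. Everything here is an assumption made for
illustration: LazyFastqReader and LazyFastqRecord are invented names (not
BioPerl classes), and the reader assumes strict four-line FASTQ records
with no line wrapping. The reader only finds record boundaries; the record
splits its raw text into fields the first time an accessor is called.

```perl
use strict;
use warnings;

package LazyFastqRecord;    # hypothetical, for illustration only

sub new {
    my ( $class, $raw ) = @_;
    return bless { raw => $raw }, $class;
}

# Split the raw four-line blob only when a field is first requested.
sub _fields {
    my ($self) = @_;
    unless ( $self->{fields} ) {
        my ( $head, $seq, $plus, $qual ) = split /\n/, $self->{raw};
        die "malformed FASTQ record\n"
          unless defined $qual && substr( $head, 0, 1 ) eq '@';
        $self->{fields} =
          { id => substr( $head, 1 ), seq => $seq, qual => $qual };
    }
    return $self->{fields};
}

sub id   { $_[0]->_fields->{id} }
sub seq  { $_[0]->_fields->{seq} }
sub qual { $_[0]->_fields->{qual} }

package LazyFastqReader;    # hypothetical, for illustration only

sub new {
    my ( $class, $fh ) = @_;
    return bless { fh => $fh }, $class;
}

# Read four lines and hand them back unparsed; returns undef at end of input.
sub next_seq {
    my ($self) = @_;
    my $fh = $self->{fh};
    my @lines;
    for ( 1 .. 4 ) {
        my $line = <$fh>;
        last unless defined $line;
        push @lines, $line;
    }
    return unless @lines;
    die "truncated FASTQ record\n" if @lines < 4;
    return LazyFastqRecord->new( join '', @lines );
}

package main;

# Two tiny records held in memory so the sketch needs no input file.
my $fastq = "\@r1\nACGT\n+\nIIII\n\@r2\nGGCC\n+\n!!!!\n";
open my $fh, '<', \$fastq or die "cannot open in-memory file: $!";
my $reader = LazyFastqReader->new($fh);

my $count = 0;
my $last;
while ( my $rec = $reader->next_seq ) {
    $count++;       # counting never triggers any field parsing
    $last = $rec;
}

print "records: $count\n";
print "last id: ", $last->id, "\n";    # parsing happens only here
```

One consequence of this design is that a malformed record is only
reported when one of its fields is accessed, which is the kind of error
tolerance (or error deferral, depending on your point of view) mentioned
earlier in the thread.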

Lastly, with respect to performance, if the parsing/object building
operation is not simply IO bound, then parallel parser/object-building CPU
threads could be considered, which could then dynamically adapt to
pre-parse attributes (e.g. quality scores) that the calling code was
actually using.  What's the state of thread-safe Perl these days?

-Aaron


On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <
cjfields at illinois.edu> wrote:

> This will likely be the approach for a more NGS-friendly Bio::Seq class.
> Calculation of the PHRED scores could also be deferred until needed.
>
> seqtk has some C-based methods that we could possibly take advantage of,
> but will have to look into it.
>
> chris
>
> On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:
>
> > You might also want to consider a lazy/pull-based parser to defer
> > parsing/object-building for pieces of the object that don't get used.  This
> > also usually provides some error tolerance.
> >
> > -Aaron
>


From sidd.basu at gmail.com  Thu Feb  7 11:38:47 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Thu, 7 Feb 2013 10:38:47 -0600
Subject: [Bioperl-l]  Re: FASTQ, was Re:BioPerl long-term,
	was Re:	dependencies on perl version
In-Reply-To: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
References: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
Message-ID: <5113d899.ea64320a.489a.262d@mx.google.com>

Another approach might be to use MapReduce (Hadoop), if possible. I have
seen one implementation in Biopython's GFF3 parser.
http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/

-siddhartha


On Thu, 07 Feb 2013, Aaron Mackey wrote:

> e.g., a pull-based FASTQ parser that did nothing else at the top level but
> "chunk" the file into as-yet-unparsed four-line blobs could appear to work
> very fast, if the user code did nothing but count the number of entries:
> 
>   while (my $seq = $seqio->next_seq) { $ct++ }
> 
> in other words, you defer *everything* except the minimal amount of
> parsing/logic required to detect object boundaries.
> 
> This is, in fact, the exact opposite of the event-based SearchIO "push"
> parsers, which always perform the most parsing possible, despite the user
> never accessing most of the material.
> 
> Lastly, with respect to performance, if the parsing/object building
> operation is not simply IO bound, then parallel parser/object-building CPU
> threads could be considered, which could then dynamically adapt to
> pre-parse attributes (e.g. quality scores) that the calling code was
> actually using.  What's the state of thread-safe Perl these days?
> 
> -Aaron


From cjfields at illinois.edu  Thu Feb  7 11:55:53 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 16:55:53 +0000
Subject: [Bioperl-l] FASTQ, was Re:BioPerl long-term,
	was Re:	dependencies on perl version
In-Reply-To: <5113d899.ea64320a.489a.262d@mx.google.com>
References: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
	<5113d899.ea64320a.489a.262d@mx.google.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7B8@CHIMBX5.ad.uillinois.edu>

I think we will want to allow for a multitude of implementations.  SeqIO already allows for that to a degree, but multiple backend implementations (say, different ways of parsing/processing FASTQ and others) aren't supported yet.

chris

On Feb 7, 2013, at 10:38 AM, Siddhartha Basu <sidd.basu at gmail.com> wrote:

> Another approach might be to use MapReduce (Hadoop), if possible. I have
> seen one implementation in Biopython's GFF3 parser.
> http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/
> 
> -siddhartha


From cjfields at illinois.edu  Thu Feb  7 12:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 17:01:07 +0000
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7EF@CHIMBX5.ad.uillinois.edu>

re: thread-safe perl, so-so at best from what I understand.

chris

On Feb 7, 2013, at 10:09 AM, Aaron Mackey <amackey at virginia.edu> wrote:

> e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries:
> 
>   while (my $seq = $seqio->next_seq) { $ct++ }
> 
> in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries.
> 
> This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material.
> 
> Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using.  What's the state of thread-safe Perl these days?
> 
> -Aaron


From hartzell at alerce.com  Thu Feb  7 16:36:24 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 13:36:24 -0800
Subject: [Bioperl-l]  BioPerl long-term,
	was Re:  dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <20756.7768.125680.662488@gargle.gargle.HOWL>

Fields, Christopher J writes:
 > George,
 > 
 > Should put your post on a pedestal :)
 > 
 > tl;dr version: I completely agree, but we need help in order to do this.
 > [...]

And therein lies the [a] problem.  Don't look at me....

I'm not coding on bioinformatics problems these days (though I'm
available...) so _maybe_ I shouldn't have gotten up on the soapbox.

But I'm so sick of getting into arguments (or walking away from
them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
you can't write good code in Perl, look - Ruby has GEMS!, etc...

Perl of the olden days was an easy language in which to write really
shitty code.  Even the Perl of the BioPerl heyday wasn't really much
help; roll your own OO, roll your own distro-building, mountains of
monkey-work to provide consistent POD, versioning, etc...

But that's not the Perl that I use.  I have Moose and Moo.  TAP and
the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

It isn't any harder to write good code, for measures that I care
about, using Perl than it is *any* of the other similar languages.

And it's just as easy, and happens just as frequently, for people to
write shitty (undocumented, untested, poorly managed, poorly packaged,
...) stuff in the other languages.

GET OFF MY LAWN, KID! (Yeah, I know...)

But BioPerl *is* dying.  You might be standing on the shoulders of
giants when you use it to solve a problem, but you *definitely* have
those same giants (and their extended families) on your shoulders
every time I see you try to move the project forward.  All of that
history has become the tail that's wagging the dog.

If all y'all are going to keep the thing alive, moving forward and
contributing to new great works then make Apple your hero.  Deprecate
the stuff that's holding you back, give folks a path forward and move
on.

Have fun.  Use sharp tools.  Do cool science.  Build cool things.
Advance your careers (forgot that one last time).  Be reasonable and
professional.

Supporting last year's projects is someone else's business
opportunity.

g.

ps.  Are all y'all following this thread?

     http://news.ycombinator.com/item?id=5123022

Maybe someone should search down for this bit: "Where to start? Any
list of this [sic] projects?" and insert a plug for the various
open-bio projects.  (But "someone" doesn't work here, he said...).


From cjfields at illinois.edu  Thu Feb  7 18:12:19 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 23:12:19 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re:  dependencies on perl version
In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1D071@CHIMBX5.ad.uillinois.edu>

On Feb 7, 2013, at 3:36 PM, George Hartzell <hartzell at alerce.com> wrote:

> Fields, Christopher J writes:
>> George,
>> 
>> Should put your post on a pedestal :)
>> 
>> tl;dr version: I completely agree, but we need help in order to do this.
>> [...]
> 
> And therein lies the [a] problem.  Don't look at me....
> 
> I'm not coding on bioinformatics problems these days (though I'm
> available...) so _maybe_ I shouldn't have gotten up on the soapbox.
> 
> But I'm so sick of getting into arguments (or walking away from
> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
> you can't write good code in Perl, look - Ruby has GEMS!, etc...

Right, but that's a perception not just in the Bio* world.  It's larger and more pervasive than that.  

> Perl of the olden days was an easy language in which to write really
> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
> help; roll your own OO, roll your own distro-building, mountains of
> monkey-work to provide consistent POD, versioning, etc...
> 
> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

Yes, and that is the direction we need to go in.

> It isn't any harder to write good code, for measures that I care
> about, using Perl than it is *any* of the other similar languages.
> 
> And it's just as easy, and happens just as frequently, for people to
> write shitty (undocumented, untested, poorly managed, poorly packaged,
> ...) stuff in the other languages.

Oh, I know.  I'm working on some very nice looking but terribly implemented Python code now.

> GET OFF MY LAWN, KID! (Yeah, I know...)
> 
> But BioPerl *is* dying.  You might be standing on the shoulders of
> giants when you use it to solve a problem, but you *definitely* have
> those same giants (and their extended families) on your shoulders
> every time I see you try to move the project forward.  All of that
> history has become the tail that's wagging the dog.

Yep.

> If all y'all are going to keep the thing alive, moving forward and
> contributing to new great works then make Apple your hero.  Deprecate
> the stuff that's holding you back, give folks a path forward and move
> on.

That's fine.

> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
> Advance your careers (forgot that one last time).  Be reasonable and
> professional.
> 
> Supporting last year's projects is someone else's business
> opportunity.
> 
> g.

Right, but this isn't just my show.  I can't do this alone; it's simply too much code and I don't have even 1/4 the time I used to have.

> ps.  Are all y'all following this thread?
> 
>     http://news.ycombinator.com/item?id=5123022
> 
> Maybe someone should search down for this bit: "Where to start? Any
> list of this [sic] projects?" and insert a plug for the various
> open-bio projects.  (But "someone" doesn't work here, he said...).

Read the original guy's post.  He's completely delusional (okay, maybe not *completely*, but he comes across as quite bitter and unrealistic).  

Frankly I don't feel so bad if he wants to leave.  He doesn't like messy things.  Biology is messy; if one doesn't understand that, then computational biology is not for them.

chris


From carandraug+dev at gmail.com  Thu Feb  7 23:12:22 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Fri, 8 Feb 2013 04:12:22 +0000
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
Message-ID: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>

On 6 February 2013 22:11, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
> [...]
> So:
>
> If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.
>
> If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.
>
> If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.
>
> If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).
>
> If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later.
>
> But I can't do it alone.  We (not just me, but we) need to drive the direction we take.
>
> First one who codes gets the gold ring.

Hi

I know I'm not much involved with bioperl development but here's my
suggestion as maintainer of another quite modular free software
project. I swear I'm not promoting it. Skip to the last paragraph for
the very short version.

Octave Forge is now a collection of packages for GNU Octave, each
released independently whenever its maintainer sees fit. But it wasn't
like that before. For a long time, everything was released at the same
time; there were no independent packages. Then it was decided to split
it into sections: main, extra and nonfree (free software dependent on
non-free libraries, now purged), and inside those, it was split into
packages, each with its own maintainer. But some packages were (and
are) more active than the others. Some packages even came from single
contributions and we never heard from the authors again. And so, with
time, cruft settled in.

We didn't want to remove the code, but no one was interested or
comfortable enough in the field to fix it either. Packages that had a
much more active development were being dragged down by code that no
one was maintaining. So we broke with that and each package is now
released independently. We have packages that haven't been released in
3 years, yes, but that just shows the packages that no one cares about.
Those have been marked as unmaintained and anyone can come around and
make a release if they care about it.

As the maintainer of the project, I do *not* make the releases of the
packages. The package maintainers prepare everything and upload
it; I only run a handful of tests (takes me 10 min), upload it to our
server, and make the official announcement. I am also the maintainer
of one of the packages, and have often made releases of unmaintained
packages because I needed them. That goes to show that if they are
important enough for someone, they will get a release somehow. If they
are not important, why would we waste our time on them anyway? We now
have around 5 package releases per month, many of them being minor
releases with a
handful of bug fixes. Preparing a release of a small package is much
easier and much less trouble than preparing a giant release
encompassing all of them at the same time.

Short version:
I'd recommend splitting the project into much smaller ones. Some of the
small ones will wither and die but those are the less important ones,
and will allow the others, the ones that people care about, freedom to
grow faster. Bioperl would still be just one project that
incorporates a hundred or so smaller modules. Let those who care
the most about a specific module take care of it and make the
releases. Releasing a module becomes much simpler, which means more
releases, more activity, and the smaller code base for each module
also makes it less intimidating for new contributors.

Carnë


From hartzell at alerce.com  Fri Feb  8 01:17:17 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 22:17:17 -0800
Subject: [Bioperl-l] injecting a bit of levity....
Message-ID: <20756.39021.553502.116384@gargle.gargle.HOWL>


Perl's not dead.  It's FAMOUS!

  http://imgs.xkcd.com/comics/perl_problems.png

g.


From carandraug+dev at gmail.com  Fri Feb  8 01:57:30 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Fri, 8 Feb 2013 06:57:30 +0000
Subject: [Bioperl-l] getting a Bio::Search::HSP::HSPI from Bio::SimpleAlign
 (to find differences between sequences)
Message-ID: <CAPOrs_084-eh9kq=uWk19jvLagKKGr2qOs3HpGLpBt7YOLaO4A@mail.gmail.com>

Hi

I already have a Bio::SimpleAlign object (got it after using TCoffee
through the bioperl-run module) and I'm trying to get a
Bio::Search::HSP::HSPI object from a pair of the aligned sequences.
How can I do this? I want to use the seq_inds method to compare the
sequences.

Here's my actual problem just in case I should be trying to fix it
some other way. I have a bunch of sequences from protein isoforms.
They have small differences between them, point-mutations, small
insertions or deletions, nothing too big. I want to make a table of
the mutations that each of them has against the consensus sequence. I
already made the alignment and have the consensus from
"$align->consensus_string". Now, I want to get something like:

isoform1: Ala67Gly, His90_Met91insGln
isoform2: ....

The seq_inds method from the Bio::Search::HSP::HSPI class seems to do
the part of finding the differences, but how can I get such an object?
I can't find it in the documentation.

Any tips, and even showing a different approach to my problem, are
most appreciated. Thanks,

Carnë
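
One way to approach the mutation table described above, without going
through Bio::Search::HSP::HSPI, is to walk the alignment columns
directly. What follows is a minimal sketch under stated assumptions,
not a BioPerl feature: list_mutations is a hypothetical helper, the toy
sequences are made up, and both inputs are assumed to be gapped strings
of equal length with '-' as the gap character (for instance the
consensus string and the string from each aligned sequence's seq
method). Whether consensus_string reports gap columns as '-' should be
checked; the sketch assumes it does. Positions are counted along the
ungapped consensus.

```perl
use strict;
use warnings;

# One-letter to three-letter codes for the 20 standard amino acids.
my %THREE = (
    A => 'Ala', R => 'Arg', N => 'Asn', D => 'Asp', C => 'Cys',
    Q => 'Gln', E => 'Glu', G => 'Gly', H => 'His', I => 'Ile',
    L => 'Leu', K => 'Lys', M => 'Met', F => 'Phe', P => 'Pro',
    S => 'Ser', T => 'Thr', W => 'Trp', Y => 'Tyr', V => 'Val',
);

# Anything outside the table (e.g. '?' or 'X') is reported as 'Xaa'.
sub three { my ($aa) = @_; return $THREE{ uc $aa } // 'Xaa' }

# Hypothetical helper, not part of BioPerl.
# Arguments: gapped consensus string, gapped isoform string (same length).
sub list_mutations {
    my ($consensus, $isoform) = @_;
    die "aligned strings differ in length\n"
        unless length($consensus) == length($isoform);

    my @cons = split //, $consensus;
    my @seq  = split //, $isoform;
    my @mutations;
    my $pos      = 0;     # 1-based position in the ungapped consensus
    my $prev     = '';    # label of the last consensus residue seen
    my $inserted = '';    # residues of an insertion still being collected

    for my $i (0 .. $#cons) {
        my ($c, $s) = ($cons[$i], $seq[$i]);
        if ($c eq '-') {
            # Gap in the consensus: the isoform may have extra residues here.
            $inserted .= three($s) if $s ne '-';
            next;
        }
        $pos++;
        my $here = three($c) . $pos;
        if ($inserted ne '') {
            # The insertion sits between the previous and current residue.
            push @mutations, ($prev || 'start') . "_${here}ins$inserted";
            $inserted = '';
        }
        if    ($s eq '-')            { push @mutations, "${here}del" }
        elsif (uc($s) ne uc($c))     { push @mutations, $here . three($s) }
        $prev = $here;
    }
    # An insertion after the last consensus residue has no right flank;
    # 'start' and 'end' are placeholder labels chosen for this sketch.
    push @mutations, ($prev || 'start') . "_endins$inserted"
        if $inserted ne '';
    return @mutations;
}

# Toy data, made up for illustration.
my $consensus = 'MKT-AHL';
my %isoforms  = (
    isoform1 => 'MRT-AHL',    # one substitution
    isoform2 => 'MKTQA-L',    # one insertion and one deletion
);
for my $name (sort keys %isoforms) {
    print "$name: ",
        join(', ', list_mutations($consensus, $isoforms{$name})), "\n";
}
```

Working on plain strings keeps the comparison independent of how the
alignment was produced; the labels only mimic the style of the example
above and are not checked against any formal nomenclature.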


From l.m.timmermans at students.uu.nl  Fri Feb  8 06:18:58 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Fri, 8 Feb 2013 12:18:58 +0100
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
Message-ID: <CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>

On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell <hartzell at alerce.com> wrote:
> But I'm so sick of getting into arguments (or walking away from
> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
> you can't write good code in Perl, look - Ruby has GEMS!, etc...
>
> Perl of the olden days was an easy language in which to write really
> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
> help; roll your own OO, roll your own distro-building, mountains of
> monkey-work to provide consistent POD, versioning, etc...
>
> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

I share that experience.

> But BioPerl *is* dying.  You might be standing on the shoulders of
> giants when you use it to solve a problem, but you *definitely* have
> those same giants (and their extended families) on your shoulders
> every time I see you try to move the project forward.  All of that
> history has become the tail that's wagging the dog.

I share your sentiment. Most of BioPerl is architected so badly I
can't stomach it most days, and I've worked on hairy codebases,
including perl itself. There's just too much sick and wrong. It's like
hundreds of dot-com-era cgi scripts.

The problem (which is common in scientific computing) is that once
code works it's effectively abandoned. BioPerl is essentially a
gathering of more than a thousand such modules.

> If all y'all are going to keep the thing alive, moving forward and
> contributing to new great works then make Apple your hero.  Deprecate
> the stuff that's holding you back, give folks a path forward and move
> on.

That would be lovely, but who is going to do that? We're suffering
from the tragedy of the commons.

> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
> Advance your careers (forgot that one last time).  Be reasonable and
> professional.

Sounds like good advice to me :-)

> Supporting last year's projects is someone else's business
> opportunity.

True!

> ps.  Are all y'all following this thread?
>
>      http://news.ycombinator.com/item?id=5123022
>
> Maybe someone should search down for this bit: "Where to start? Any
> list of this [sic] projects?" and insert a plug for the various
> open-bio projects.  (But "someone" doesn't work here, he said...).

Interesting discussion, though the original post is too cynical even
for my taste.

Leon


From cjfields at illinois.edu  Fri Feb  8 09:08:56 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 8 Feb 2013 14:08:56 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
	<CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1DA2D@CHIMBX5.ad.uillinois.edu>

On Feb 8, 2013, at 5:18 AM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell <hartzell at alerce.com> wrote:
>> But I'm so sick of getting into arguments (or walking away from
>> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
>> you can't write good code in Perl, look - Ruby has GEMS!, etc...
>> 
>> Perl of the olden days was an easy language in which to write really
>> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
>> help; roll your own OO, roll your own distro-building, mountains of
>> monkey-work to provide consistent POD, versioning, etc...
>> 
>> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
>> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
>> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.
> 
> I share that experience.
> 
>> But BioPerl *is* dying.  You might be standing on the shoulders of
>> giants when you use it to solve a problem, but you *definitely* have
>> those same giants (and their extended families) on your shoulders
>> every time I see you try to move the project forward.  All of that
>> history has become the tail that's wagging the dog.
> 
> I share your sentiment. Most of BioPerl is architected so badly I
> can't stomach it most days, and I've worked on hairy codebases,
> including perl itself. There's just too much sick and wrong. It's like
> hundreds of dot-com-era cgi scripts.
> 
> The problem (which is common in scientific computing) is that once
> code works it's effectively abandoned. BioPerl is essentially a
> gathering of more than a thousand such modules.

Yep, the progression from 'it works' to 'it works very well' tends to have very high activation energy.  Many of the fixes tend to be more bandaids (get it working) than fundamental surgery.  I tried my hand at this, got a few things done.

>> If all y'all are going to keep the thing alive, moving forward and
>> contributing to new great works then make Apple your hero.  Deprecate
>> the stuff that's holding you back, give folks a path forward and move
>> on.
> 
> That would be lovely, but who is going to do that? We're suffering
> from the tragedy of the commons.

Spot on, but we could break that path for the time being.  I think BioPerl as is will have to be in maintenance mode; we need a new effort to break with older perl, older practices.  

>> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
>> Advance your careers (forgot that one last time).  Be reasonable and
>> professional.
> 
> Sounds like good advice to me :-)
> 
>> Supporting last year's projects is someone else's business
>> opportunity.
> 
> True!

We just need to make a bioperl 1.x branch for the maintenance bit, rechristen 'master' as 'v2', and just move on to fixing the f****** code.  Let's move on that.

>> ps.  Are all y'all following this thread?
>> 
>>     http://news.ycombinator.com/item?id=5123022
>> 
>> Maybe someone should search down for this bit: "Where to start? Any
>> list of this [sic] projects?" and insert a plug for the various
>> open-bio projects.  (But "someone" doesn't work here, he said...).
> 
> Interesting discussion, though the original post is too cynical even
> for my taste.
> 
> Leon

Yes, that's not unusual unfortunately.  We have a number of physicists and mathematicians here who have started their initial forays into computational biology; they're all startled at how noisy it is and how messy code can be.  Of course their disciplines have had the benefit of teaching students how to (somewhat decently) code for the last 40 years.

chris


From l.m.timmermans at students.uu.nl  Fri Feb  8 07:08:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Fri, 8 Feb 2013 13:08:06 +0100
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
References: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
Message-ID: <CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>

On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
> Short version:
> I'd recommend splitting the project into much smaller ones. Some of the
> small ones will wither and die but those are the less important ones,
> and will allow the others, the ones that people care about, freedom to
> grow faster. Bioperl would still be just one project that
> incorporates a hundred or so smaller modules. Let those who care
> the most about a specific module take care of it and make the
> releases. Releasing a module becomes much simpler, which means more
> releases, more activity, and the smaller code base for each module
> also makes it less intimidating for new contributors.

That has been a goal for some time now, but it's fairly complicated.
Not only do we have a LOT of modules (bioperl-live alone is more than
900), they also have complicated dependencies. I've attached the
results of my static dependency analysis of bioperl-live. I suspect
this split-up needs to be done by automated graph analysis; it's too
much to do by hand.

Leon
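
As an illustration of the kind of automated graph analysis mentioned
above, one possible first step is to find strongly connected components
in the dependency graph: modules that depend on each other in a cycle
cannot be released separately, so each component is a lower bound on
what has to ship together. This is a minimal sketch, not the analysis
that produced the attached files; the module names and edges are made
up, and strongly_connected_components is a hypothetical helper that
implements Tarjan's algorithm on a hash of module => [modules it uses].

```perl
use strict;
use warnings;

# Toy dependency graph, invented for illustration only.
my %deps = (
    'My::Seq'        => [ 'My::PrimarySeq', 'My::Root' ],
    'My::PrimarySeq' => [ 'My::Root', 'My::Seq' ],    # cycle with My::Seq
    'My::Root'       => [],
    'My::TreeIO'     => [ 'My::Root' ],
);

# Hypothetical helper: Tarjan's strongly connected components.
# Takes a hash reference (node => array ref of successors) and returns
# a list of array refs, each holding the sorted members of a component.
sub strongly_connected_components {
    my ($graph) = @_;
    my (%index, %lowlink, %on_stack, @stack, @components);
    my $counter = 0;

    my $visit;
    $visit = sub {
        my ($node) = @_;
        $index{$node} = $lowlink{$node} = $counter++;
        push @stack, $node;
        $on_stack{$node} = 1;

        for my $next (@{ $graph->{$node} || [] }) {
            if (!exists $index{$next}) {
                # Unvisited successor: recurse, then inherit its lowlink.
                $visit->($next);
                $lowlink{$node} = $lowlink{$next}
                    if $lowlink{$next} < $lowlink{$node};
            }
            elsif ($on_stack{$next}) {
                # Successor is on the stack, so it is in the current component.
                $lowlink{$node} = $index{$next}
                    if $index{$next} < $lowlink{$node};
            }
        }

        # $node is the root of a component: pop the stack down to it.
        if ($lowlink{$node} == $index{$node}) {
            my @members;
            while (1) {
                my $member = pop @stack;
                $on_stack{$member} = 0;
                push @members, $member;
                last if $member eq $node;
            }
            push @components, [ sort @members ];
        }
    };

    for my $node (sort keys %$graph) {
        $visit->($node) unless exists $index{$node};
    }
    undef $visit;    # break the closure's reference to itself
    return @components;
}

# Print one line per component; multi-module lines must be split together.
print join(' + ', @$_), "\n" for strongly_connected_components(\%deps);
```

On the toy graph this groups My::PrimarySeq with My::Seq and leaves the
other two modules on their own. The recursion is fine for a sketch, but
a graph with long dependency chains may trigger Perl's deep-recursion
warning, in which case an iterative version would be preferable.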
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.dot
Type: application/octet-stream
Size: 93463 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20130208/bdbbda1e/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.png
Type: image/png
Size: 6694525 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20130208/bdbbda1e/attachment-0003.png>

From sebastien.moretti at unil.ch  Fri Feb  8 11:19:29 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=)
Date: Fri, 08 Feb 2013 17:19:29 +0100
Subject: [Bioperl-l] PhyloXML
Message-ID: <51152591.9010402@unil.ch>

Hi

I would like to add some XML to an existing PhyloXML tree.

No problem to read and write it.
I would like to add <name>smthg</name> after the <phylogeny> tag as in
http://www.phyloxml.org/examples_syntax/phyloxml_syntax_example_1.html
but I get problems with add_phyloXML_annotation():

Can't locate object method "annotation" via package "Bio::Tree::Tree" at
         /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, <GEN0> line 1 (#1)
     (F) You called a method correctly, and it correctly indicated a package
     functioning as a class, but that package doesn't define that particular
     method, nor does any of its base classes.  See perlobj.

Uncaught exception from user code:
         Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, <GEN0> line 1.
  at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984
         Bio::TreeIO::phyloxml::element_default('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 670
         Bio::TreeIO::phyloxml::processXMLNode('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 309
         Bio::TreeIO::phyloxml::add_phyloXML_annotation('Bio::TreeIO::phyloxml=HASH(0x134b1268)', '-obj', 'Bio::Tree::Tree=HASH(0x13525258)', '-xml', '<name>SUMF family</name>') called at ./add_annotation_to_phyloxml.pl line 40


I think I am doing something wrong, but what?
Here is the code:

my $treeio = Bio::TreeIO->new(-file   => $infile,
                              -format => 'phyloxml',
                             );
my $tree = $treeio->next_tree;

# Add annotation
$treeio->add_phyloXML_annotation(-obj => $tree,
                                 -xml => '<name>SUMF family</name>',
                                );

-- 
Sébastien Moretti


From cjfields at illinois.edu  Sat Feb  9 01:25:17 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 06:25:17 +0000
Subject: [Bioperl-l] BioPerl future
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>

All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.  

Thoughts?  

chris


From cjfields at illinois.edu  Sat Feb  9 01:43:24 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 06:43:24 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>
References: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
	<CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F2C6@CHIMBX5.ad.uillinois.edu>

On Feb 8, 2013, at 6:08 AM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>> Short version:
>> I'd recommend splitting the project into much smaller ones. Some of the
>> small ones will wither and die but those are the less important ones,
>> and will allow the others, the ones that people care about, freedom to
>> grow faster. Bioperl would still be just one project that
>> incorporates a hundred or so smaller modules. Let those who care
>> the most about a specific module take care of it and make the
>> releases. Releasing a module becomes much simpler, which means more
>> releases, more activity, and the smaller code base for each module
>> also makes it less intimidating for new contributors.
> 
> That has been a goal for some time now, but it's fairly complicated.
> Not only do we have a LOT of modules (bioperl-live alone is more than
> 900), they also have complicated dependencies. I've attached the
> results of my static dependency analysis of bioperl-live. I suspect
> this split-up needs to be done by automated graph analysis; it's too
> much to do by hand.
> 
> Leon
> <deps.dot><deps.png>

Leon, 

I'm hoping we can do this sooner rather than later.  In fact, if we proceed with making a 'v1' branch or something similar, we can start extricating code within the next few weeks.

chris


From cjfields at illinois.edu  Sat Feb  9 08:51:35 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 13:51:35 +0000
Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future
Message-ID: <prc698q0fqtymq1n70jhdi5w.1360417710993@email.android.com>

Sheldon,

The branch is where the old (v1.x) code would reside.  Master branch would be v2.

Chris


Sent via phone


-------- Original message --------
From: Sheldon McKay <sheldon.mckay at gmail.com>
Date:
To: "Fields, Christopher J" <cjfields at illinois.edu>
Cc: BioPerl List <Bioperl-l at lists.open-bio.org>,gmod-gbrowse at lists.sourceforge.net
Subject: Re: [Gmod-gbrowse] BioPerl future


Hi Chris,

This sounds like a good idea.  I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands.

One question I have is whether the name of branch v1 might be perceived as a step backward.  How about v2?

Sheldon

On Saturday, February 9, 2013, Fields, Christopher J wrote:
All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.

Thoughts?

chris


--
Sheldon McKay, PhD
Computational Biologist
DNA Learning Center
Cold Spring Harbor Laboratory
1 Bungtown Rd
Cold Spring Harbor, NY 11724
(516) 367-5185
www.dnalc.org<http://www.dnalc.org>


From sheldon.mckay at gmail.com  Sat Feb  9 08:04:50 2013
From: sheldon.mckay at gmail.com (Sheldon McKay)
Date: Sat, 9 Feb 2013 08:04:50 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAEs59kkOhJ-czn_aXOcP+yOszQdGGLgaAMNp+u_0MqS=xXapng@mail.gmail.com>

Hi Chris,

This sounds like a good idea.  I think it will eventually allow bioperl to
evolve into a leaner, meaner package that would be more likely to be
adopted by new or isolated bioinformaticians, who tend to be put off by
the size and complexity of bioperl as it now stands.

One question I have is whether the name of branch v1 might be perceived as
a step backward.  How about v2?

Sheldon

On Saturday, February 9, 2013, Fields, Christopher J wrote:

> All,
>
> (cross-posting to gmod-gbrowse)
>
> I want to gauge the community's thoughts on a few things.  At the moment I
> think we can safely say that BioPerl 1.x is in maintenance mode.  By
> 'maintenance mode', I mean that we can only do so much with it w/o breaking
> backwards compatibility with old scripts.  We need a way forward so that we
> can address fundamental problems within the core codebase, namely speed.
>
> I am thinking at the moment of pushing a 'v1' branch next week after I
> make an official announcement, with a new 1.6 release coming out from that
> branch (as already announced, tentatively scheduled for March 1).  That
> frees up master for any code development, removal of modules/cruft, etc.
>  This will open an initial path forward and at least enable us to do more.
>  Make sense?  This of course means that any code reliant on v1 should pull
> from that branch instead of 'master'.
>
> Thoughts?
>
> chris
>
>


-- 
Sheldon McKay, PhD
Computational Biologist
DNA Learning Center
Cold Spring Harbor Laboratory
1 Bungtown Rd
Cold Spring Harbor, NY 11724
(516) 367-5185
www.dnalc.org


From cjfields at illinois.edu  Sat Feb  9 23:25:14 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 10 Feb 2013 04:25:14 +0000
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>

Apologies if you receive this twice. I never received the replies from the gbrowse list through bioperl-l so it is possible there were mail issues last night.

------------------------

All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.  

Thoughts?  

chris


From genehack at genehack.org  Sat Feb  9 23:36:07 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Sat, 9 Feb 2013 20:36:07 -0800
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>
References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>
Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6@genehack.org>

On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:

> Thoughts?  

+1

The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. 

j.

-- 
John SJ Anderson // genehack at genehack.org


From carandraug+dev at gmail.com  Sun Feb 10 13:40:33 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Sun, 10 Feb 2013 18:40:33 +0000
Subject: [Bioperl-l] BioPerl future
Message-ID: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>

On 10 February 2013 17:00,  <bioperl-l-request at lists.open-bio.org> wrote:
> Message: 3
> Date: Sat, 9 Feb 2013 20:36:07 -0800
> From: John SJ Anderson <genehack at genehack.org>
> Subject: Re: [Bioperl-l] BioPerl future
> To: "Fields, Christopher J" <cjfields at illinois.edu>
> Cc: BioPerl List <Bioperl-l at lists.open-bio.org>
> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org>
> Content-Type: text/plain; charset=us-ascii
>
> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
>
>> Thoughts?
>
> +1
>
> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories.

For those interested, I have just added instructions on the wiki on
how to split a subset of modules, tests, files, etc from the
bioperl-live repository into a new repository while keeping their old
history.

http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live

Carnë


From cjfields at illinois.edu  Sun Feb 10 15:08:35 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 10 Feb 2013 20:08:35 +0000
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>
References: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE20632@CHIMBX5.ad.uillinois.edu>

On Feb 10, 2013, at 12:40 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 10 February 2013 17:00,  <bioperl-l-request at lists.open-bio.org> wrote:
>> Message: 3
>> Date: Sat, 9 Feb 2013 20:36:07 -0800
>> From: John SJ Anderson <genehack at genehack.org>
>> Subject: Re: [Bioperl-l] BioPerl future
>> To: "Fields, Christopher J" <cjfields at illinois.edu>
>> Cc: BioPerl List <Bioperl-l at lists.open-bio.org>
>> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
>> 
>>> Thoughts?
>> 
>> +1
>> 
>> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories.
> 
> For those interested, I have just added instructions on the wiki on
> how to split a subset of modules, tests, files, etc from the
> bioperl-live repository into a new repository while keeping their old
> history.
> 
> http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live
> 
> Carnë

It's probably worth looking at this page as well, then:

http://www.bioperl.org/wiki/BioPerl_Modularization

We should probably merge the two.

chris


From hlapp at drycafe.net  Sun Feb 10 20:03:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Sun, 10 Feb 2013 20:03:34 -0500
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <51152591.9010402@unil.ch>
References: <51152591.9010402@unil.ch>
Message-ID: <F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>

On Feb 8, 2013, at 11:19 AM, Moretti Sébastien <sebastien.moretti at unil.ch> wrote:

> # Add annotation
> $treeio->add_phyloXML_annotation(-obj => $tree,
>                                -xml => '<name>SUMF family</name>',
>                               );

If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?

	-hilmar

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From sebastien.moretti at unil.ch  Mon Feb 11 02:08:22 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=)
Date: Mon, 11 Feb 2013 08:08:22 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
Message-ID: <511898E6.7060400@unil.ch>

>> # Add annotation
>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>                                 -xml => '<name>SUMF family</name>',
>>                                );
>
> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>
> 	-hilmar

I replaced $treeio by $tree in the above line but still get an error.
Don't see what you mean by "the stack suggests that the above isn't the 
exact line in your script"

The only thing I changed is the length of the xml string I try to 
insert. But get the same error with an empty xml string.


my $treeio = new Bio::TreeIO(-file   => "$infile",
                              -format => 'phyloxml',
                             );
my $tree = $treeio->next_tree;

# Add annotation
$tree->add_phyloXML_annotation(-obj => $tree,
                                -xml => '<name>SUMF family</name>',
                               );

Can't locate object method "add_phyloXML_annotation" via package
	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> 
line 1 (#1)
     (F) You called a method correctly, and it correctly indicated a package
     functioning as a class, but that package doesn't define that particular
     method, nor does any of its base classes.  See perlobj.

Uncaught exception from user code:
	Can't locate object method "add_phyloXML_annotation" via package 
"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
  at ./add_annotation_to_phyloxml.pl line 40
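
For readers unfamiliar with this Perl error: it means that neither the class of the object the method was called on (here Bio::Tree::Tree) nor any of its parent classes defines a method of that name. Below is a minimal pure-Perl sketch of how `ref` and `can` show which object actually provides a method before it is called. The packages `My::Tree` and `My::Reader` and the method `add_annotation` are made-up stand-ins for illustration, not the real BioPerl classes or API.

```perl
use strict;
use warnings;

package My::Tree;      # hypothetical stand-in for a tree class
sub new { return bless {}, shift }

package My::Reader;    # hypothetical stand-in for a reader/IO class
sub new { return bless {}, shift }
sub add_annotation {   # made-up method, defined on the reader only
    my ($self, %args) = @_;
    return "annotated " . ref($args{-obj});
}

package main;

my $reader = My::Reader->new;
my $tree   = My::Tree->new;

for my $obj ($reader, $tree) {
    my $class = ref $obj;
    if ($obj->can('add_annotation')) {
        # The class (or one of its parents) defines the method
        print "$class: ", $obj->add_annotation(-obj => $tree), "\n";
    }
    else {
        # Calling the method here would die with
        # "Can't locate object method ..."
        print "$class: no such method\n";
    }
}
```

Running the same kind of check against the objects in a failing script (for example `print ref($obj), "\n"` followed by `$obj->can(...)`) is a quick way to see which class is expected to own the method.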


-- 
Sébastien Moretti
Department of Ecology and Evolution,
Biophore, University of Lausanne,
CH-1015 Lausanne, Switzerland
Tel.: +41 (21) 692 4221/4079
http://bioinfo.unil.ch/


From saladi1 at illinois.edu  Tue Feb 12 16:24:34 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Tue, 12 Feb 2013 13:24:34 -0800
Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons
Message-ID: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>

Hi,

I am using the count_codons method from Bio::Tools::SeqStats and keep
getting "AMBIGUOUS" codons, but I can't figure out why exactly.

When I translate the same sequence that gives the error using another
standard utility like (ExPASy - Translate), it seems to work alright.

An example sequence is below. Could anyone lend some insight?

Thanks,
Shyam


AAA          1.722488038277511961722488038277511961722
AAC          2.966507177033492822966507177033492822967
AAG          1.531100478468899521531100478468899521531
AAT          0.9569377990430622009569377990430622009569
ACA          0.4784688995215311004784688995215311004785
ACC          1.722488038277511961722488038277511961722
ACG          1.33971291866028708133971291866028708134
ACT          1.913875598086124401913875598086124401914
AGA          0.1913875598086124401913875598086124401914
AGC          0.7655502392344497607655502392344497607656
AGT          1.435406698564593301435406698564593301435
*AMBIGUOUS*  *0.09569377990430622009569377990430622009569*
ATA          0.3827751196172248803827751196172248803828
ATC          2.488038277511961722488038277511961722488
ATG          3.349282296650717703349282296650717703349
ATT          3.636363636363636363636363636363636363636
CAA          2.870813397129186602870813397129186602871
CAC          0.3827751196172248803827751196172248803828
CAG          1.626794258373205741626794258373205741627
CAT          0.4784688995215311004784688995215311004785
CCA          1.722488038277511961722488038277511961722
CCC          0.5741626794258373205741626794258373205742
CCG          1.052631578947368421052631578947368421053
CCT          1.244019138755980861244019138755980861244
CGA          0.3827751196172248803827751196172248803828
CGC          0.7655502392344497607655502392344497607656
CGG          0.1913875598086124401913875598086124401914
CGT          2.488038277511961722488038277511961722488
CTA          0.4784688995215311004784688995215311004785
CTC          0.6698564593301435406698564593301435406699
CTG          2.105263157894736842105263157894736842105
CTT          0.8612440191387559808612440191387559808612
GAA          2.870813397129186602870813397129186602871
GAC          1.435406698564593301435406698564593301435
GAG          1.722488038277511961722488038277511961722
GAT          2.775119617224880382775119617224880382775
GCA          2.00956937799043062200956937799043062201
GCC          2.488038277511961722488038277511961722488
GCG          3.540669856459330143540669856459330143541
GCT          2.00956937799043062200956937799043062201
GGA          0.1913875598086124401913875598086124401914
GGC          2.392344497607655502392344497607655502392
GGG          0.8612440191387559808612440191387559808612
GGT          5.454545454545454545454545454545454545455
GTA          1.913875598086124401913875598086124401914
GTC          0.8612440191387559808612440191387559808612
GTG          4.593301435406698564593301435406698564593
GTT          2.679425837320574162679425837320574162679
TAA          0.09569377990430622009569377990430622009569
TAC          1.148325358851674641148325358851674641148
TAT          1.148325358851674641148325358851674641148
TCA          0.8612440191387559808612440191387559808612
TCC          0.4784688995215311004784688995215311004785
TCG          2.105263157894736842105263157894736842105
TCT          0.9569377990430622009569377990430622009569
TGG          0.9569377990430622009569377990430622009569
TGT          0.09569377990430622009569377990430622009569
TTA          2.679425837320574162679425837320574162679
TTC          2.966507177033492822966507177033492822967
TTG          3.062200956937799043062200956937799043062
TTT          2.775119617224880382775119617224880382775
count        1045
filename     temp.seq

ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTACGCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTCGTATTTGCCTTCGTACCACCTGC
GGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAGATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTAGGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA


From bosborne11 at verizon.net  Tue Feb 12 21:30:08 2013
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 12 Feb 2013 21:30:08 -0500
Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons
In-Reply-To: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>
References: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>
Message-ID: <C13C35A7-4DBE-4797-A584-DCB6AF772D25@verizon.net>

Shyam,

An ambiguous codon would be one that has a character other than [ACTGU] in it. I see '!' in your sequences, that would create an ambiguous codon.

Brian O.
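
To illustrate the point about characters other than [ACTGU], here is a minimal pure-Perl sketch. It is not the Bio::Tools::SeqStats implementation; the sub names `count_codons_sketch` and `non_acgtu_positions` and the 'ambiguous' key are made up for illustration. It splits a sequence into codons, pools any codon containing a character other than A, C, G, T or U, and reports where such characters occur. The sequence as originally posted contains a Y (the IUPAC code for C or T) in `...TCTGGCTTYTTGATG...`, which by itself would be enough to produce an ambiguous codon. If the reported values are percentages of the 1045 codons counted, 0.0957 corresponds to a single codon.

```perl
use strict;
use warnings;

# Count codons in frame 1; any codon with a character outside ACGTU
# is pooled under the (hypothetical) key 'ambiguous'.
sub count_codons_sketch {
    my ($seq) = @_;
    $seq = uc $seq;
    $seq =~ s/\s+//g;              # drop whitespace and line breaks
    my %count;
    while ($seq =~ /(.{3})/g) {    # trailing 1-2 bases are ignored
        my $codon = $1;
        my $key = $codon =~ /^[ACGTU]{3}$/ ? $codon : 'ambiguous';
        $count{$key}++;
    }
    return \%count;
}

# Return [1-based position, character] for every non-ACGTU character.
sub non_acgtu_positions {
    my ($seq) = @_;
    my @hits;
    while ($seq =~ /([^ACGTU])/gi) {
        push @hits, [ pos($seq), $1 ];
    }
    return @hits;
}

# Short fragment around the Y in the posted sequence
# (the reading frame here is chosen for illustration only).
my $fragment = 'TCTGGCTTYTTGATG';
my $counts   = count_codons_sketch($fragment);
print "$_\t$counts->{$_}\n" for sort keys %$counts;
printf "position %d: %s\n", @$_ for non_acgtu_positions($fragment);
```

Running `non_acgtu_positions` over the full sequence (with line breaks removed) is a quick way to find the offending residue before deciding whether to correct or mask it.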


On Feb 12, 2013, at 4:24 PM, Shyam Saladi <saladi1 at illinois.edu> wrote:

> Hi,
> 
> I am using the count_codons method from Bio::Tools::SeqStats and keep
> getting "AMBIGUOUS" codons, but I can't figure out why exactly.
> 
> When I translate the same sequence that gives the error using another
> standard utility like (ExPASy - Translate), it seems to work alright.
> 
> An example sequence is below. Could anyone lend some insight?
> 
> Thanks,
> Shyam
> 
> 
> 
> AAA     AAC     AAG     AAT     ACA     ACC     ACG     ACT     AGA     AGC
>    AGT     *AMBIGUOUS*       ATA     ATC     ATG     ATT     CAA     CAC
>  CAG     CAT     CCA     CCC     CCG     CCT     CGA     CGC     CGG
> CGT     CTA     CTC     CTG     CTT     GAA     GAC     GAG     GAT     GCA
>    GCC     GCG     GCT     GGA     GGC     GGG     GGT     GTA     GTC
> GTG     GTT     TAA     TAC     TAT     TCA     TCC     TCG     TCT     TGG
>    TGT     TTA     TTC     TTG     TTT     count   filename
> 1.722488038277511961722488038277511961722
> 2.966507177033492822966507177033492822967
> 1.531100478468899521531100478468899521531
> 0.9569377990430622009569377990430622009569
> 0.4784688995215311004784688995215311004785
> 1.722488038277511961722488038277511961722
> 1.33971291866028708133971291866028708134
> 1.913875598086124401913875598086124401914
> 0.1913875598086124401913875598086124401914
> 0.7655502392344497607655502392344497607656
> 1.435406698564593301435406698564593301435       *
> 0.09569377990430622009569377990430622009569*
> 0.3827751196172248803827751196172248803828
> 2.488038277511961722488038277511961722488
> 3.349282296650717703349282296650717703349
> 3.636363636363636363636363636363636363636
> 2.870813397129186602870813397129186602871
> 0.3827751196172248803827751196172248803828
> 1.626794258373205741626794258373205741627
> 0.4784688995215311004784688995215311004785
> 1.722488038277511961722488038277511961722
> 0.5741626794258373205741626794258373205742
> 1.052631578947368421052631578947368421053
> 1.244019138755980861244019138755980861244
> 0.3827751196172248803827751196172248803828
> 0.7655502392344497607655502392344497607656
> 0.1913875598086124401913875598086124401914
> 2.488038277511961722488038277511961722488
> 0.4784688995215311004784688995215311004785
> 0.6698564593301435406698564593301435406699
> 2.105263157894736842105263157894736842105
> 0.8612440191387559808612440191387559808612
> 2.870813397129186602870813397129186602871
> 1.435406698564593301435406698564593301435
> 1.722488038277511961722488038277511961722
> 2.775119617224880382775119617224880382775
> 2.00956937799043062200956937799043062201
> 2.488038277511961722488038277511961722488
> 3.540669856459330143540669856459330143541
> 2.00956937799043062200956937799043062201
> 0.1913875598086124401913875598086124401914
> 2.392344497607655502392344497607655502392
> 0.8612440191387559808612440191387559808612
> 5.454545454545454545454545454545454545455
> 1.913875598086124401913875598086124401914
> 0.8612440191387559808612440191387559808612
> 4.593301435406698564593301435406698564593
> 2.679425837320574162679425837320574162679
> 0.09569377990430622009569377990430622009569
> 1.148325358851674641148325358851674641148
> 1.148325358851674641148325358851674641148
> 0.8612440191387559808612440191387559808612
> 0.4784688995215311004784688995215311004785
> 2.105263157894736842105263157894736842105
> 0.9569377990430622009569377990430622009569
> 0.9569377990430622009569377990430622009569
> 0.09569377990430622009569377990430622009569
> 2.679425837320574162679425837320574162679
> 2.966507177033492822966507177033492822967
> 3.062200956937799043062200956937799043062
> 2.775119617224880382775119617224880382775       1045    temp.seq
> 
> ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTAC!
> GCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTC!
> GTATTTGCCTTCGTACCACCTGCGGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAG
> ATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTA!
> GGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Feb 13 10:18:10 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 15:18:10 +0000
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>

All,

tl;dr: A lot of change is coming.  Be forewarned and be prepared.

This is an 'official' announcement to the BioPerl community on future BioPerl plans.  We have decided to move continued maintenance of the BioPerl 1.6.x release series over to the new 'v1' branch.  This branch will be the point where any future versions of 1.6.x code will be released, starting with the (already-scheduled) March 1 release.  The 'master' branch will become the main focal point for future development of BioPerl going into an eventual v2 release, with a focus on performance enhancements, addressing newer technologies like NGS and large data, code cleanup, and simplifying the code base.

We welcome any help with code improvements. GMOD folks? Want to help? This is a good opportunity to address BioPerl shortcomings in the code base! 

What this means for anyone using BioPerl currently:

1) We anticipate significant issues if you are relying on the 'master' branch for anything.  To state it inelegantly, the core developers are taking back the 'master' branch for future development. Please please please do not rely on the 'master' branch for stable code; if you are reliant on BioPerl 1.6.x, make sure to use 'v1'.  We can revisit whether to make 'v1' the default checkout branch if/when the need arises.

2) Expect not to find some modules.  We will be migrating modules requiring external dependencies and other associated chunks of the code base out into their own repositories over the next year to help future maintenance; the eventual intent is to release all of these independently on CPAN.  We will completely remove all code previously marked as deprecated, and we may immediately deprecate additional modules if needed (this will of course be discussed on list).

3) Expect version numbering to change significantly.  Because we are releasing code in separate repositories, I fully expect downstream versioning problems if we stick with the current system (e.g. all bioperl-live modules having the same version).  It will be too much of a headache to sync versions for all modules as this will entail making a full release of all bioperl code, one of the main reasons we are splitting out code to begin with.  At the moment, no specific versioning scheme has been chosen, though I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions).  This is the standard that Lincoln has adopted for Bio::Graphics and GBrowse.

4) Expect quick deprecation of methods within modules as needed.  These should of course be brought up on the mailing list prior to actual implementation, but I would anticipate some things changing as we try to adopt a more consistent method naming scheme.

5) The same steps outlined for bioperl-live will apply for bioperl-run modules.  We will have to decide the best approach to use for those, e.g. whether to separate them out based on task (alignment), application group (NGS, BLAST, RNA), etc. and how these may fit organically with bioperl-live modules where appropriate.

6) Do not expect a new CPAN release of such code until Dec 2013.  Even then it will be in an alpha stage.  We are all busy campers.
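
As background for point 3, here is a small sketch of how Perl's core `version` module treats three-point and X.Y version strings. It only illustrates the comparison rules and is not a statement of which scheme will be adopted. The version strings other than 1.6.901 are arbitrary examples.

```perl
use strict;
use warnings;
use version;

my $dotted  = version->parse('1.6.901');   # three-point (dotted-decimal) form
my $decimal = version->parse('1.006901');  # decimal form of the same release

print "same version\n" if $dotted == $decimal;
print $dotted->normal, " ", $dotted->numify, "\n";

# Decimal X.Y versions compare as numbers, so 1.9 sorts after 1.10 ...
print "1.9 > 1.10 (decimal)\n"
    if version->parse('1.9') > version->parse('1.10');

# ... whereas dotted versions compare component by component.
print "v1.9.0 < v1.10.0 (dotted)\n"
    if version->parse('v1.9.0') < version->parse('v1.10.0');
```

Whichever scheme is chosen, sticking to one form across all split-out repositories avoids mixing the two comparison rules.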

We do not anticipate significant changes to bioperl-network or bioperl-db at this time beyond updating them to deal with new changes. 

I'm sure there are many other points that need to be discussed.   Please reply over the next week if you have any concerns. 

chris


From cjfields at illinois.edu  Wed Feb 13 11:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 16:01:07 +0000
Subject: [Bioperl-l] Test-pls ignore
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2506D@CHIMBX5.ad.uillinois.edu>

testing the mail list to see if it is working.

-c


From sebastien.moretti at unil.ch  Wed Feb 13 11:21:23 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=)
Date: Wed, 13 Feb 2013 17:21:23 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
Message-ID: <511BBD83.2000708@unil.ch>

>>>> # Add annotation
>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>                                 -xml => '<name>SUMF family</name>',
>>>>                                );
>>>
>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>
>>> 	-hilmar
>>
>> I replaced $treeio by $tree in the above line but still get an error.
>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
>>
>> The only thing I changed is the length of the xml string I try to insert. But get the same error with an empty xml string.
>>
>>
>>
>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>                              -format => 'phyloxml',
>>                             );
>> my $tree = $treeio->next_tree;
>>
>> # Add annotation
>> $tree->add_phyloXML_annotation(-obj => $tree,
>>                                -xml => '<name>SUMF family</name>',
>>                               );
>>
>> Can't locate object method "add_phyloXML_annotation" via package
>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>     (F) You called a method correctly, and it correctly indicated a package
>>     functioning as a class, but that package doesn't define that particular
>>     method, nor does any of its base classes.  See perlobj.
>>
>> Uncaught exception from user code:
>> 	Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
>> at ./add_annotation_to_phyloxml.pl line 40
>
> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>
> chris

Do you mean that BioPerl 1.6.901 does not fully support PhyloXML?
Is the problem I have "expected"?

-- 
Sébastien Moretti


From cjfields at illinois.edu  Wed Feb 13 10:47:17 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 15:47:17 +0000
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <511898E6.7060400@unil.ch>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>

On Feb 11, 2013, at 1:08 AM, Sébastien MORETTI <sebastien.moretti at unil.ch> wrote:

>>> # Add annotation
>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>                                -xml => '<name>SUMF family</name>',
>>>                               );
>> 
>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>> 
>> 	-hilmar
> 
> I replaced $treeio by $tree in the above line but still get an error.
> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
> 
> The only thing I changed is the length of the xml string I try to insert. But get the same error with an empty xml string.
> 
> 
> 
> my $treeio = new Bio::TreeIO(-file   => "$infile",
>                             -format => 'phyloxml',
>                            );
> my $tree = $treeio->next_tree;
> 
> # Add annotation
> $tree->add_phyloXML_annotation(-obj => $tree,
>                               -xml => '<name>SUMF family</name>',
>                              );
> 
> Can't locate object method "add_phyloXML_annotation" via package
> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>    (F) You called a method correctly, and it correctly indicated a package
>    functioning as a class, but that package doesn't define that particular
>    method, nor does any of its base classes.  See perlobj.
> 
> Uncaught exception from user code:
> 	Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
> at ./add_annotation_to_phyloxml.pl line 40
> 
> 
> 
> -- 
> Sébastien Moretti
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221/4079
> http://bioinfo.unil.ch/

Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.

chris


From carandraug+dev at gmail.com  Wed Feb 13 12:23:23 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Wed, 13 Feb 2013 17:23:23 +0000
Subject: [Bioperl-l] Next BioPerl release
Message-ID: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>

On 5 February 2013 21:53, Fields, Christopher J <cjfields at illinois.edu> wrote:
> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!

Hi

Is this release for bioperl-live only, or does it also include bioperl-run?

Carnë


From cjfields at illinois.edu  Wed Feb 13 12:08:21 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 17:08:21 +0000
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <511BBD83.2000708@unil.ch>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
	<511BBD83.2000708@unil.ch>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 10:21 AM, Moretti Sébastien <sebastien.moretti at unil.ch> wrote:

>>>>> # Add annotation
>>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>>                                -xml => '<name>SUMF family</name>',
>>>>>                               );
>>>> 
>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>> 
>>>> 	-hilmar
>>> 
>>> I replaced $treeio by $tree in the above line but still get an error.
>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
>>> 
>>> The only thing I changed is the length of the XML string I try to insert, but I get the same error with an empty XML string.
>>> 
>>> 
>>> 
>>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>>                             -format => 'phyloxml',
>>>                            );
>>> my $tree = $treeio->next_tree;
>>> 
>>> # Add annotation
>>> $tree->add_phyloXML_annotation(-obj => $tree,
>>>                               -xml => '<name>SUMF family</name>',
>>>                              );
>>> 
>>> Can't locate object method "add_phyloXML_annotation" via package
>>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>>    (F) You called a method correctly, and it correctly indicated a package
>>>    functioning as a class, but that package doesn't define that particular
>>>    method, nor does any of its base classes.  See perlobj.
>>> 
>>> Uncaught exception from user code:
>>> 	
>>> at ./add_annotation_to_phyloxml.pl line 40
>> 
>> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>> 
>> chris
> 
> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ?
> The problem I have is "expected" ?
> 
> -- 
> Sébastien Moretti

I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky.  I tried cleaning this up a few years back but didn't make much progress.

The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it):

    $treeio->add_phyloXML_annotation(-obj => $tree,
                              -xml => '<name>SUMF family</name>',
                             );

My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back.  Can you file a bug on this?

https://redmine.open-bio.org/

chris
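
For readers puzzled by the "Can't locate object method" error quoted above: Perl looks a method up in the class of the object it is called on, so the same call can work on one object and die on another. The following is only a minimal sketch of that rule. My::TreeReader, My::Tree and add_annotation are hypothetical stand-ins, not the BioPerl classes, and nothing here claims to mirror how Bio::TreeIO::phyloxml is implemented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a reader object that owns the method.
package My::TreeReader;
sub new { return bless {}, shift }
sub add_annotation {
    my ($self, %args) = @_;
    push @{ $args{-obj}{annotations} }, $args{-xml};
    return $args{-obj};
}

# Hypothetical stand-in for a tree object that does NOT define it.
package My::Tree;
sub new { return bless { annotations => [] }, shift }

package main;

my $reader = My::TreeReader->new;
my $tree   = My::Tree->new;

# Works: the method is defined in the reader's class.
$reader->add_annotation(-obj => $tree, -xml => '<name>SUMF family</name>');

# Dies at run time: My::Tree (and its base classes) has no such method.
my $ok    = eval { $tree->add_annotation(-obj => $tree, -xml => ''); 1 };
my $error = $ok ? '' : "$@";
print "error: $error";

# can() lets a script check before calling.
print "tree can annotate: ", ($tree->can('add_annotation') ? "yes" : "no"), "\n";
```

Checking with can() before the call is one way to turn the fatal error into a readable diagnostic while the underlying bug is tracked down.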


From cjfields at illinois.edu  Wed Feb 13 13:05:53 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 18:05:53 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 11:23 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 5 February 2013 21:53, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!
> 
> Hi
> 
> Is this release for bioperl-live only, or does it also include bioperl-run?
> 
> Carnë

We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.

chris


From l.m.timmermans at students.uu.nl  Wed Feb 13 14:44:22 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 13 Feb 2013 20:44:22 +0100
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>

On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.

Also, bioperl-run needs to be cut into smaller distributions even more
than bioperl-live. Few people, if anyone at all, have all the tools it
tries to wrap at hand, so it's almost impossible to pass its test suite.

We need dists that can realistically pass.

Leon


From cjfields at illinois.edu  Wed Feb 13 16:04:26 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 21:04:26 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25B07@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 1:44 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.
> 
> Also, bioperl-run needs to be cut into smaller distributions even more
> than bioperl-live. Few people, if anyone at all, have all the tools it
> tries to wrap at hand, so it's almost impossible to pass its test suite.
> 
> We need dists that can realistically pass.
> 
> Leon

Yup.  It's a mess.

chris


From florent.angly at gmail.com  Wed Feb 13 17:33:14 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 14 Feb 2013 08:33:14 +1000
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
Message-ID: <511C14AA.9030107@gmail.com>

On 14/02/13 01:18, Fields, Christopher J wrote:
> I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions)
Yes, I support the X.Y versioning as well.
Florent


From l.m.timmermans at students.uu.nl  Wed Feb 13 18:12:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Thu, 14 Feb 2013 00:12:06 +0100
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <511C14AA.9030107@gmail.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
	<511C14AA.9030107@gmail.com>
Message-ID: <CAC1jpXBk9prChjjeHmnykWh4j7FRMN1adY0ibzM8uqH1+Z5uGA@mail.gmail.com>

On Wed, Feb 13, 2013 at 11:33 PM, Florent Angly <florent.angly at gmail.com> wrote:
> On 14/02/13 01:18, Fields, Christopher J wrote:
>>
>> I *highly* recommend using X.Y versioning for simplicity (e.g. no more
>> 3-point versions)
>
> Yes, I support the X.Y versioning as well.
> Florent

See also: http://www.dagolden.com/index.php/369/version-numbers-should-be-boring/

Leon


From daisieh at gmail.com  Thu Feb 14 00:21:15 2013
From: daisieh at gmail.com (Daisie Huang)
Date: Wed, 13 Feb 2013 21:21:15 -0800 (PST)
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
References: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
Message-ID: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>

I think you need to reset the pointer to the filehandle before you go 
through the while loop the second time: seek $fh,0,0
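
A minimal sketch of that suggestion, assuming the tab-delimited layout from the script quoted below (family ID in the second column). The sub name count_per_family and the in-memory test data are made up for illustration, and an in-memory filehandle stands in for the real file:

```perl
use strict;
use warnings;

# Count matching rows per family by rereading the same handle.
sub count_per_family {
    my ($text, @families) = @_;
    open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
    my %count;
    for my $family (@families) {
        $count{$family} = 0;
        while (my $line = <$fh>) {
            chomp $line;
            my @data = split /\t/, $line;
            $count{$family}++
                if defined $data[1] && $data[1] eq "ABF09-$family";
        }
        # Without this the handle stays at end-of-file, so the while
        # loop body never runs for the next family.
        seek($fh, 0, 0) or die "Can't rewind: $!";
    }
    close($fh) or die "Can't close: $!";
    return \%count;
}

my $text  = "1\tABF09-AS5\n2\tABF09-AS9\n3\tABF09-AS5\n";
my $count = count_per_family($text, 'AS5', 'AS9');
print "$_: $count->{$_}\n" for sort keys %$count;
```

Rewinding works, but it reads the file once per family; collecting everything in a single pass avoids the problem altogether.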

On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
>
> Hey Guys,
>
> I am still at the same place. I am writing these little pieces of code to 
> try to learn the language better, so any advice would be useful. I am again 
> parsing tab-delimited files, and now I am trying to find fish from one ID 
> (in this case families AS5 and AS9), retrieve their weights, and average 
> them. When I started, I did it for one family and it worked (instead of 
> @families I had a scalar $family set to AS5). But really it is more useful 
> to look at more than one family at a time (I should mention that there are 
> 2 types of fish per family: one ends in PS, the other doesn't). So I tried 
> to use a foreach loop to go through the file twice, once with the search 
> value set to AS5 and a second time with AS9. It works for AS5 but, for some 
> reason, the foreach loop sets $test to AS9 the second time yet doesn't go 
> through the while loop. What am I doing wrong? 
>
> here is the code:
>
> #! /usr/bin/perl
> use strict;
> use warnings;
>
> my $file = $ARGV[0];
> my @family = ('AS5','AS9');
> my $i;
> my $ii;
> my $test;
>
> open (my $fh, "<", $file) or die ("Can't open $file: $!");
>
> foreach (@family){
>     $test = $_;
>     my @data_weight_2N = ();
>     my @data_weight_3N = ();
>     while (<$fh>){
>         chomp;  
>         my $line = $_;
>         my @data  = split ("\t", $line);
>         if ($data[0] !~ /[0-9]*/){
>         next;}
>         elsif ($data[1] eq "ABF09-$test"){
>             $i += 1; 
>             push (@data_weight_2N,  $data[6]);
>         }elsif ($data[1] eq "ABF09-".$test."PS"){
>         $ii += 1;
>             push (@data_weight_3N,$data[6]);
>     }
> }
>     my $mean_2N = &average (\@data_weight_2N);
>     my $stdev_2N = &stdev (\@data_weight_2N);
>     my $stderr_2N = ($stdev_2N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test 
> 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
>
>     my $mean_3N = &average (\@data_weight_3N);
>     my $stdev_3N = &stdev (\@data_weight_3N);
>     my $stderr_3N = ($stdev_3N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test 
> 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
> }
>
> close ($fh);
>
> sub average{
>         my($data) = @_;
>         if (not @$data) {
>                 print ("Empty array\n");
>                 return 0;
>         }
>         my $total = 0;
>         foreach (@$data) {
>                 $total += $_;
>         }
>         my $average = $total / @$data;
>         return $average;
> }
>
> sub stdev{
>         my($data) = @_;
>         if(@$data == 1){
>                 return 0;
>         }
>         my $average = &average($data);
>         my $sqtotal = 0;
>         foreach(@$data) {
>                 $sqtotal += ($average-$_) ** 2;
>         }
>         my $std = ($sqtotal / (@$data-1)) ** 0.5;
>         return $std;
> }
>
> Thanks,
>
> T.
>
> -- 
> "Education is not to be used to promote obscurantism." - Theodonius 
> Dobzhansky.
>
> "Gracias a la vida que me ha dado tanto
> Me ha dado el sonido y el abecedario
> Con él, las palabras que pienso y declaro
> Madre, amigo, hermano
> Y luz alumbrando la ruta del alma del que estoy amando
>
> Gracias a la vida que me ha dado tanto
> Me ha dado la marcha de mis pies cansados
> Con ellos anduve ciudades y charcos
> Playas y desiertos, montañas y llanos
> Y la casa tuya, tu calle y tu patio"
>
> Violeta Parra - Gracias a la Vida
>
> Tiago S. F. Hori. PhD.
> Ocean Science Center-Memorial University of Newfoundland 
>


From sebastien.moretti at unil.ch  Thu Feb 14 03:09:06 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=)
Date: Thu, 14 Feb 2013 09:09:06 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
	<511BBD83.2000708@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>
Message-ID: <511C9BA2.9000508@unil.ch>

>>>>>> # Add annotation
>>>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>>>                                 -xml => '<name>SUMF family</name>',
>>>>>>                                );
>>>>>
>>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>>>
>>>>> 	-hilmar
>>>>
>>>> I replaced $treeio by $tree in the above line but still get an error.
>>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
>>>>
>>>> The only thing I changed is the length of the XML string I try to insert, but I get the same error with an empty XML string.
>>>>
>>>>
>>>>
>>>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>>>                              -format => 'phyloxml',
>>>>                             );
>>>> my $tree = $treeio->next_tree;
>>>>
>>>> # Add annotation
>>>> $tree->add_phyloXML_annotation(-obj => $tree,
>>>>                                -xml => '<name>SUMF family</name>',
>>>>                               );
>>>>
>>>> Can't locate object method "add_phyloXML_annotation" via package
>>>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>>>     (F) You called a method correctly, and it correctly indicated a package
>>>>     functioning as a class, but that package doesn't define that particular
>>>>     method, nor does any of its base classes.  See perlobj.
>>>>
>>>> Uncaught exception from user code:
>>>> 	
>>>> at ./add_annotation_to_phyloxml.pl line 40
>>>
>>> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>>>
>>> chris
>>
>> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ?
>> The problem I have is "expected" ?
>>
>> --
>> Sébastien Moretti
>
> I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky.  I tried cleaning this up a few years back but didn't make much progress.
>
> The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it):
>
>      $treeio->add_phyloXML_annotation(-obj => $tree,
>                                -xml => '<name>SUMF family</name>',
>                               );
>
> My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back.  Can you file a bug on this?
>
> https://redmine.open-bio.org/
>
> chris

I will file a bug on this.

I'd be happy to try to contribute to the phyloxml code,
but I don't know how to proceed for BioPerl.

-- 
Sébastien Moretti


From hartzell at alerce.com  Thu Feb 14 15:04:44 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 14 Feb 2013 12:04:44 -0800
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
References: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
	<3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
Message-ID: <20765.17244.185833.755900@gargle.gargle.HOWL>


I think that it's important to get feedback on code that one has
written and to try to understand what someone else has done in their
code, and how and why.  To that end....

Since Tiago's using this to learn the language better I can't resist
some comments beyond resetting the file handle.

For grins I rewrote it using Text::CSV_XS and Statistics::Basic and to
take a single pass through the data file using a multilevel data
structure.
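
To illustrate the single-pass idea (this is a sketch for the thread, not the contents of the gist linked below): each row is filed under its family and ploidy as it is read, so the file never needs rewinding. The sub name collect_weights and the sample rows are made up, the column positions are taken from the quoted script, and a plain split stands in for Text::CSV_XS only so the sketch runs with core Perl.

```perl
use strict;
use warnings;

# Group weights by family and ploidy in one pass over the handle.
sub collect_weights {
    my ($fh, @families) = @_;
    my %wanted = map { $_ => 1 } @families;
    my %weights;    # $weights{$family}{'2N' or '3N'} = [ weights... ]
    while (my $line = <$fh>) {
        chomp $line;
        my @data = split /\t/, $line;
        next unless defined $data[1] && defined $data[6];
        # IDs look like ABF09-AS5 (2N) or ABF09-AS5PS (3N) in the quoted script.
        my ($family, $ps) = $data[1] =~ /^ABF09-(.+?)(PS)?$/ or next;
        next unless $wanted{$family};
        push @{ $weights{$family}{ $ps ? '3N' : '2N' } }, $data[6];
    }
    return \%weights;
}

# Made-up sample data: a header row, then seven tab-separated columns.
my $tsv = join '',
    "id\tfish\tc2\tc3\tc4\tc5\tweight\n",
    "1\tABF09-AS5\t.\t.\t.\t.\t10\n",
    "2\tABF09-AS5PS\t.\t.\t.\t.\t20\n",
    "3\tABF09-AS9\t.\t.\t.\t.\t30\n",
    "4\tABF09-AS5\t.\t.\t.\t.\t12\n",
    "5\tABF09-AS7\t.\t.\t.\t.\t99\n";

open(my $fh, '<', \$tsv) or die "Can't open in-memory file: $!";
my $weights = collect_weights($fh, 'AS5', 'AS9');
close($fh) or die "Can't close: $!";

for my $family (sort keys %$weights) {
    for my $ploidy (sort keys %{ $weights->{$family} }) {
        print "$family $ploidy: @{ $weights->{$family}{$ploidy} }\n";
    }
}
```

The means and standard deviations can then be computed per family from the collected arrays, whatever statistics module is used.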

I resisted the urge to rewrite it in Moose.  Didn't even have an urge
to rewrite it in R.  Funny, that....

The script is here

  Tiago.pl
    https://gist.github.com/hartzell/4955401

With something like what I think the data looks like here:

    https://gist.github.com/hartzell/4955570

Even without that big of a rewrite, I had a bunch of local comments
which are inline below.

Daisie Huang writes:
 > [...]
 > On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
 > >
 > > Hey Guys,
 > >
 > > I am still at the same place. I am writing these little pieces of code to 
 > > try to learn the language better, so any advice would be useful.
 > > [...]
 > > here is the code:
 > >
 > > #! /usr/bin/perl
 > > use strict;
 > > use warnings;
 > >
 > > my $file = $ARGV[0];

Slightly better would be $filename, so that when you step up to
Path::Class you can differentiate a file object from a file name
string.

 > > my @family = ('AS5','AS9');

Better would be @families, plural.  See the use of $family below.

 > > my $i;
 > > my $ii;

As far as I can tell, these are just counting the number of things
that you push onto the various arrays.  You don't need them, referring
to the list in scalar context will give you its size.

 > > my $test;

You use this to hold the name of the family, so it's not particularly
evocative.  You should also restrict its scope to within the loop.
See the comment for the foreach loop.

 > > open (my $fh, "<", $file) or die ("Can't open $file: $!");

You made my day, three arg. open *and* you checked for errors.  Nice!

 > > foreach (@family){

Better would be

  for my $family (@families) {

which is evocative and restricts the scope of $family to the for loop
(and for is 4 characters shorter than foreach...).

 > >     $test = $_;

No longer need this, using $family declared in the for loop with the
proper scoping.

 > >     my @data_weight_2N = ();
 > >     my @data_weight_3N = ();
 > >     while (<$fh>){
 > >         chomp;  
 > >         my $line = $_;
 > >         my @data  = split ("\t", $line);

Don't parse CSV (TSV) files yourself.  Get in the habit of using
Text::CSV_XS.

 > >         if ($data[0] !~ /[0-9]*/){
 > >         next;}
 > >         elsif ($data[1] eq "ABF09-$test"){
 > >             $i += 1; 

You don't need the counter.

 > >             push (@data_weight_2N,  $data[6]);
 > >         }elsif ($data[1] eq "ABF09-".$test."PS"){
 > >         $ii += 1;

You don't need the counter.

 > >             push (@data_weight_3N,$data[6]);
 > >     }
 > > }
 > >     my $mean_2N = &average (\@data_weight_2N);
 > >     my $stdev_2N = &stdev (\@data_weight_2N);

You don't need the ampersands on the subroutine calls.  They're old
school <joke> and just encourage people to make fun of our language for its
use of all those funny punctuation marks </joke>.

 > >     my $stderr_2N = ($stdev_2N/sqrt($i));

Unless I'm mistaken, this is equivalent

    my $stderr_2N = ($stdev_2N/sqrt(scalar @data_weight_2N));

and you don't need the counter, the explicit use of scalar there might
even be redundant (I'm a coward).  You use the same trick in your
subroutine defn's below.
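
A small sketch of the point about counters, with a made-up helper name (stderr_of_mean) and made-up numbers. Taking the count from the array itself also sidesteps two problems in the quoted script: the 3N standard error is divided by sqrt($i), which is the 2N counter, and the counters are never reset between families.

```perl
use strict;
use warnings;

# Standard error of the mean: sample stdev divided by sqrt(n).
sub stderr_of_mean {
    my ($stdev, $data) = @_;
    my $n = @$data;    # an array in scalar context yields its element count
    return unless $n;  # nothing sensible to report for an empty list
    return $stdev / sqrt($n);
}

my @weights = (10, 12, 14, 16);
my $se = stderr_of_mean(2.0, \@weights);
print "n = ", scalar(@weights), ", stderr = $se\n";
```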

 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test 
 > > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
 > >
 > >     my $mean_3N = &average (\@data_weight_3N);
 > >     my $stdev_3N = &stdev (\@data_weight_3N);
 > >     my $stderr_3N = ($stdev_3N/sqrt($i));
 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test 
 > > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
 > > }
 > >
 > > close ($fh);

Ah, rats.  You checked whether open worked, you need to do the same
thing on close too!

  close ($fh) or die $!;

Or you could just

  use autodie qw(open close);

and then they'll die appropriately when they have to and you don't
have to bother with the checking.
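
A minimal sketch of what that looks like in practice. The path is hypothetical and deliberately does not exist; the eval is there only so the sketch can show the failure rather than stop at it.

```perl
use strict;
use warnings;
use autodie qw(open close);

# Hypothetical path that should not exist, to trigger the failure.
my $missing = 'no_such_dir_for_this_sketch/data.tsv';

# No "or die" needed: autodie makes a failed open throw an exception.
my $ok    = eval { open(my $fh, '<', $missing); close($fh); 1 };
my $error = $ok ? '' : "$@";

print "open failed as expected: $error\n" unless $ok;
```

In a real script the eval would normally be dropped, so that a failed open or close simply ends the program with the message autodie generates.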

 > > sub average{
 > >         my($data) = @_;
 > >         if (not @$data) {
 > >                 print ("Empty array\n");
 > >                 return 0;
 > >         }
 > >         my $total = 0;
 > >         foreach (@$data) {
 > >                 $total += $_;
 > >         }

  use List::AllUtils qw(sum); # somewhere up at the top of the script...

  my $total = sum(@$data);
  if (not defined $total) {
     print "Empty array\n";
     return;
  }

List::AllUtils is your friend.  Learn to use it.

Your returning 0 for an empty list is probably the wrong thing, isn't
it possible for the total to actually be 0?  Just return instead.
Don't return undef, just return (and let perl take context into
account for you).

You probably don't actually want to spew "Empty array" out into your
output stream, imagine writing a script that postprocesses your output
and having to deal with it.  If you really need to say it, send it to
standard error with

  print STDERR "Empty array\n";

 > >         my $average = $total / @$data;
 > >         return $average;

If you don't really need the error message, then you can get to

  my $total = sum(@$data);
  return unless defined $total;
  return $total / @$data;

And if an empty data array is *truly* unexpected, maybe you should
just die/carp.
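
Putting those pieces together, here is one possible shape for the two subroutines. It is a sketch only: it uses sum from the core List::Util module (List::AllUtils re-exports the same function), and it checks for the empty list explicitly so that a total of 0 is still treated as a valid result.

```perl
use strict;
use warnings;
use List::Util qw(sum);

sub average {
    my ($data) = @_;
    return unless @$data;           # empty list: return nothing
    return sum(@$data) / @$data;    # a total of 0 is a legitimate value
}

# Sample standard deviation (n - 1 in the denominator, as in the original).
sub stdev {
    my ($data) = @_;
    return unless @$data;
    return 0 if @$data == 1;
    my $mean    = average($data);
    my $sqtotal = sum(map { ($_ - $mean) ** 2 } @$data);
    return sqrt($sqtotal / (@$data - 1));
}

my @weights = (2, 4, 4, 4, 5, 5, 7, 9);
printf "mean = %.3f, stdev = %.3f\n", average(\@weights), stdev(\@weights);
```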

 > > }
 > >
 > > sub stdev{
 > >         my($data) = @_;
 > >         if(@$data == 1){
 > >                 return 0;
 > >         }
 > >         my $average = &average($data);
 > >         my $sqtotal = 0;
 > >         foreach(@$data) {
 > >                 $sqtotal += ($average-$_) ** 2;
 > >         }
 > >         my $std = ($sqtotal / (@$data-1)) ** 0.5;
 > >         return $std;
 > > }

Ditto on the use of List::AllUtils, etc...

Phew.

The only other thing I'd like to see would be an arrangement that
lets you write simple tests.  A simple sol'n would be to package the
entire main part of the code up into e.g. a subroutine that returns a
hashref keyed by family, containing a hashref keyed by 2N/3N/... and
then you could just:

  use Test::More;
  
  use Tiago qw(summarize);
  
  my $output = summarize("test_data.tsv");
  
  is($output->{AS5}->{'2N'}, "42", "Got the magic number");
  
  # etc...
  
  done_testing;
  
Thanks for sharing your code.  Keep practicing!

g.


From carandraug+dev at gmail.com  Thu Feb 14 17:13:45 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Thu, 14 Feb 2013 22:13:45 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
Message-ID: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>

Hi

we got word of it on another project I'm involved with and I was
wondering. Is bioperl going to apply for the Google Summer of Code
this year?

http://www.google-melange.com/gsoc/homepage/google/gsoc2013

Carnë


From hlapp at drycafe.net  Fri Feb 15 09:28:30 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 15 Feb 2013 09:28:30 -0500
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
Message-ID: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>

I presume the OBF will, as an umbrella organization on behalf of all Bio* projects. If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or to look for co-mentors.

-hilmar

Sent with a tap.

On Feb 14, 2013, at 5:13 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> we got word of it on another project I'm involved with and I was
> wondering. Is bioperl going to apply for the Google Summer of Code
> this year?
> 
> http://www.google-melange.com/gsoc/homepage/google/gsoc2013
> 
> Carnë
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From p.j.a.cock at googlemail.com  Fri Feb 15 09:47:39 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 15 Feb 2013 14:47:39 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
Message-ID: <CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>

On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:
> I presume the OBF does as an umbrella organization on behalf of all Bio*
> projects. If you fancy proposing a project idea or mentoring, now is not a
> bad time to think about that or looking for co-mentors.
>
> -hilmar

Yes, the plan is that as in the last few years, the OBF will apply to
GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
this stage the Bio* projects would be wise to start coming up with
some good project ideas and experienced developers thinking about
being a mentor. For potential students, getting involved in the
community early is a good idea (e.g. bug reports, or better fixing
existing bugs)

See also:
http://lists.open-bio.org/mailman/listinfo/gsoc
http://lists.open-bio.org/mailman/listinfo/gsoc-mentors

Peter


From cjfields at illinois.edu  Fri Feb 15 09:59:43 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 15 Feb 2013 14:59:43 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
	<CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>

On Feb 15, 2013, at 8:47 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:
>> I presume the OBF does as an umbrella organization on behalf of all Bio*
>> projects. If you fancy proposing a project idea or mentoring, now is not a
>> bad time to think about that or looking for co-mentors.
>> 
>> -hilmar
> 
> Yes, the plan is that as in the last few years, the OBF will apply to
> GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
> this stage the Bio* projects would be wise to start coming up with
> some good project ideas and experienced developers thinking about
> being a mentor. For potential students, getting involved in the
> community early is a good idea (e.g. bug reports, or better fixing
> existing bugs)
> 
> See also:
> http://lists.open-bio.org/mailman/listinfo/gsoc
> http://lists.open-bio.org/mailman/listinfo/gsoc-mentors
> 
> Peter

At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else.  I can't take charge of writing up a proposal at the moment but I can certainly help edit.

chris


From scott at scottcain.net  Fri Feb 15 14:18:37 2013
From: scott at scottcain.net (Scott Cain)
Date: Fri, 15 Feb 2013 14:18:37 -0500
Subject: [Bioperl-l] sequence-region directives in gff files
In-Reply-To: <CAPOrs_3r_cay3d59uBXCNqKwGHRBOBy+c+XOzvrfMeHdbzNTLg@mail.gmail.com>
References: <CAPOrs_3r_cay3d59uBXCNqKwGHRBOBy+c+XOzvrfMeHdbzNTLg@mail.gmail.com>
Message-ID: <CA+JTaox4SeQueWRpvgmq7GpdJ=EzQe6t3Lim2yn6y=_dBcp95A@mail.gmail.com>

Hi Carnë,

Thanks for pointing this out; I was only sort of paying attention to
the FeatureIO discussion, and it hadn't occurred to me that my commit
was the problem.

I believe I've reproduced the functionality from that commit, and I
even added a test that makes use of the added method (yes, I know, it
surprised me too!).  All of the tests now pass for me in the FeatureIO
master.  I'm putting it on my todo list to check that the Chado loader
that makes use of Bio::FeatureIO still works as expected with the new
incarnation.

Thanks,
Scott


On Wed, Feb 13, 2013 at 5:22 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
> Hi Scott
>
> 3 years ago, the code for the Bio::SeqFeatureIO::* modules was split
> from bioperl-live into a separate repository[1]. Because the code was
> not removed from the bioperl-live repository, people ended up patching
> on both sides, leading to 2 branches of development. Last weekend I
> merged them back together with the exception of one commit that would
> no longer apply[2].
>
> This commit was authored by you with the following commit message:
> "tiny change to Bio::FeatureIO::gff to allow the gmod chado gff3 bulk
> loader to not choke when the gff file has ##sequence-region
> directives.  The loader is documented not to support this, but now it
> will quitely ignore those directives."
>
> Do you think you could take a look at it?
>
> Thank you,
> Carnë
>
> [1] https://github.com/bioperl/Bio-FeatureIO
> [2] https://github.com/bioperl/bioperl-live/commit/7218728b66ad297953676236077fd0ec757378c0


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From carandraug+dev at gmail.com  Tue Feb 19 13:52:57 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Tue, 19 Feb 2013 18:52:57 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAPOrs_0u2Qpft6_pWMaj3Wdf_-ZPOfnoYoOaevdCL443hnUsoA@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
	<CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0u2Qpft6_pWMaj3Wdf_-ZPOfnoYoOaevdCL443hnUsoA@mail.gmail.com>
Message-ID: <CAPOrs_0kiyqSfvS7ZgEkWwbAaiA2L5fV9U2r5U9cROTvyMGLRw@mail.gmail.com>

On 15 February 2013 14:28, Hilmar Lapp <hlapp at drycafe.net> wrote:
> [...]
> If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors.

On 15 February 2013 14:59, Fields, Christopher J <cjfields at illinois.edu> wrote:
> At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else.  I can't take charge of writing up a proposal at the moment but I can certainly help edit.

I would like to participate this year as a student.

I do not, however, have any bioperl itch that would take a summer
to fix. The largest of them is to implement BLAST using NCBI's server.
They have made available a SOAP-based BLAST and doing this has been on
my todo for ages. Would you suggest any other project for bioperl?

Carnë


From peymanalavi at yahoo.com  Tue Feb 19 16:16:49 2013
From: peymanalavi at yahoo.com (peyman alavi)
Date: Tue, 19 Feb 2013 13:16:49 -0800 (PST)
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
Message-ID: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>

Hello,
I have been having problems for a while trying to install the Bio::SCF
module on my Vista32 machine. Now, I know that Bio::SCF isn't really a
BioPerl module, but I need it for Bio::Graphics, and I thought perhaps
other people had experienced the same problem before. I have installed
zlib and io_lib (the latest available versions of both), but it looks
like something (presumably to do with io_lib) is missing. I should be
very grateful if someone could tell me what still needs to be done!
Here are the paths where the io_lib "library" and "include" directories
are installed; I set them in cpan before trying to install Bio::SCF:
o conf makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"
And the following is what I get on STDOUT:

Set up gcc environment - 4.7.2

cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.

    makepl_arg         [LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include]
Please use 'o conf commit' to make the config permanent!

Reading 'D:\Perl\cpan\Metadata'
  Database was generated on Sun, 17 Feb 2013 12:17:02 GMT
Running install for module 'Bio::SCF'
Running make for L/LD/LDS/Bio-SCF-1.03.tar.gz
Checksum for D:\Perl\cpan\sources\authors\id\L\LD\LDS\Bio-SCF-1.03.tar.gz ok
Scanning cache D:\Perl/cpan/build for sizes
..............................................................................DONE
Bio-SCF-1.03/
Bio-SCF-1.03/t/
Bio-SCF-1.03/t/scf.t
Bio-SCF-1.03/eg/
Bio-SCF-1.03/eg/write_test_obj.pl
Bio-SCF-1.03/eg/write_test_tied.pl
Bio-SCF-1.03/eg/read_test_obj.pl
Bio-SCF-1.03/eg/read_test_tied.pl
Bio-SCF-1.03/SCF/
Bio-SCF-1.03/SCF/Arrays.pm
Bio-SCF-1.03/DISCLAIMER
Bio-SCF-1.03/README
Bio-SCF-1.03/SCF.pm
Bio-SCF-1.03/SCF.xs
Bio-SCF-1.03/Changes
Bio-SCF-1.03/test.scf
Bio-SCF-1.03/Makefile.PL
Bio-SCF-1.03/META.yml
Bio-SCF-1.03/INSTALL
Bio-SCF-1.03/MANIFEST

  CPAN.pm: Building L/LD/LDS/Bio-SCF-1.03.tar.gz

Set up gcc environment - 4.7.2
Checking if your kit is complete...
Looks good
Writing Makefile for Bio::SCF
Writing MYMETA.yml and MYMETA.json
cp SCF.pm blib\lib\Bio\SCF.pm
cp SCF/Arrays.pm blib\lib\Bio\SCF\Arrays.pm
D:\Perl\bin\perl.exe D:\Perl\site\lib\ExtUtils\xsubpp -typemap D:\Perl\lib\ExtUtils\typemap SCF.xs > SCF.xsc &&
D:\Perl\bin\perl.exe -MExtUtils::Command -e mv -- SCF.xsc SCF.c
Please specify prototyping behavior for SCF.xs (see perlxs manual)
c:/MinGW/bin/gcc.exe -c -Ic:/MinGW/msys/1.0/local/include -DNDEBUG
-DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE
-DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T
-DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields -O2 -DVERSION=\"1.03\" -DXS_VERSION=\"1.03\" "-ID:\Perl\lib\CORE" -DLITTLE_ENDIAN SCF.c
In file included from c:/MinGW/msys/1.0/local/include/io_lib/scf.h:31:0,
                 from SCF.xs:12:
c:/MinGW/msys/1.0/local/include/io_lib/mFILE.h:23:0: warning: "MF_APPEND" redefined [enabled by default]
In file included from c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/windows.h:55:0,
                 from D:\Perl\lib\CORE/win32.h:61,
                 from D:\Perl\lib\CORE/win32thread.h:4,
                 from D:\Perl\lib\CORE/perl.h:2825,
                 from SCF.xs:5:
c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/winuser.h:131:0: note: this is the location of the previous definition
SCF.xs: In function 'XS_Bio__SCF_get_scf_pointer':
SCF.xs:35:2: warning: passing argument 3 of '(*Perl_ILIO_ptr((struct
PerlInterpreter *)Perl_get_context()))->pNameStat' from incompatible pointer
type [enabled by default]
SCF.xs:35:2: note: expected 'struct _stati64 *' but argument is of type
'struct stat *'
Running Mkbootstrap for Bio::SCF ()
D:\Perl\bin\perl.exe -MExtUtils::Command -e chmod -- 644 SCF.bs
D:\Perl\bin\perl.exe -MExtUtils::Mksymlists \
     -e "Mksymlists('NAME'=>\"Bio::SCF\", 'DLBASE' => 'SCF', 'DL_FUNCS' => {  }, 'FUNCLIST' => [], 'IMPORTS' => {  }, 'DL_VARS' => []);"
Set up gcc environment - 4.7.2
dlltool --def SCF.def --output-exp dll.exp
c:\MinGW\bin\g++.exe -o blib\arch\auto\Bio\SCF\SCF.dll -Wl,--base-file
-Wl,dll.base -mdll -L"D:\Perl\lib\CORE" SCF.o D:\Perl\lib\CORE\perl512.lib
c:\MinGW\lib\libkernel32.a c:\MinGW\lib\libuser32.a c:\MinGW\lib\libgdi32.a
c:\MinGW\lib\libwinspool.a c:\MinGW\lib\libcomdlg32.a c:\MinGW\lib\libadvapi32.a
c:\MinGW\lib\libshell32.a c:\MinGW\lib\libole32.a c:\MinGW\lib\liboleaut32.a
c:\MinGW\lib\libnetapi32.a c:\MinGW\lib\libuuid.a c:\MinGW\lib\libws2_32.a
c:\MinGW\lib\libmpr.a c:\MinGW\lib\libwinmm.a c:\MinGW\lib\libversion.a
c:\MinGW\lib\libodbc32.a c:\MinGW\lib\libodbccp32.a c:\MinGW\lib\libcomctl32.a
c:\MinGW\lib\libmsvcrt.a dll.exp
Warning: resolving _VirtualQuery at 12 by linking to _VirtualQuery
Use --enable-stdcall-fixup to disable these warnings
Use --disable-stdcall-fixup to disable these fixups
Warning: resolving _VirtualProtect at 16 by linking to _VirtualProtect
Warning: resolving _EnterCriticalSection at 4 by linking to _EnterCriticalSection
Warning: resolving _TlsGetValue at 4 by linking to _TlsGetValue
Warning: resolving _GetLastError at 0 by linking to _GetLastError
Warning: resolving _LeaveCriticalSection at 4 by linking to _LeaveCriticalSection
Warning: resolving _DeleteCriticalSection at 4 by linking to _DeleteCriticalSection
Warning: resolving _InitializeCriticalSection at 4 by linking to _InitializeCriticalSection
SCF.o:SCF.c:(.text+0xf35): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0xf4b): undefined reference to `mfwrite_scf'
SCF.o:SCF.c:(.text+0xf6a): undefined reference to `mfflush'
SCF.o:SCF.c:(.text+0xf72): undefined reference to `mfdestroy'
SCF.o:SCF.c:(.text+0x1138): undefined reference to `write_scf'
SCF.o:SCF.c:(.text+0x16ac): undefined reference to `scf_deallocate'
SCF.o:SCF.c:(.text+0x17b1): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0x17c1): undefined reference to `mfread_scf'
SCF.o:SCF.c:(.text+0x19bd): undefined reference to `read_scf'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: SCF.o: bad reloc address 0xa4 in section `.rdata'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: final link failed: Invalid operation
collect2.exe: error: ld returned 1 exit status
dmake.exe:  Error code 129, while making 'blib\arch\auto\Bio\SCF\SCF.dll'
  LDS/Bio-SCF-1.03.tar.gz
  D:\Perl\site\bin\dmake.exe -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 LDS/Bio-SCF-1.03.tar.gz                      : make NO

Warning: Configuration not saved.
Lockfile removed.

Thanks in advance for any useful suggestions/help!!
Peyman


From scott at scottcain.net  Tue Feb 19 18:39:44 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 19 Feb 2013 18:39:44 -0500
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan
	fails
In-Reply-To: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
References: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
Message-ID: <777246AB-2EF0-403D-9652-8EA8390D5C53@scottcain.net>

Hi Peyman,

I have no idea what might be required to get staden and Bio::SCF installed on a windows machine; you have my sympathies for having to go through it. 

But what I wanted to touch on was what you wrote, namely that you "need" it for Bio::Graphics. You don't need it unless you want to be able to display traces from ABI sequencers (which most people don't really care to do these days). Bio::SCF is listed as a recommended module, not a required one.

Scott


Sent from my iPad

> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From anngregory at email.arizona.edu  Wed Feb 20 00:20:41 2013
From: anngregory at email.arizona.edu (Ann Gregory)
Date: Tue, 19 Feb 2013 22:20:41 -0700
Subject: [Bioperl-l]  Problem Parsing BLAST output to annotate FASTA file
Message-ID: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>

Hi BioPerl,

I am having issues with a BioPerl script. I have a blastxml file from a
blastx search and the original multi-FASTA file containing the original
nucleotide sequences.

I want to take the BLAST result (i.e. the hit description) and annotate my
multi-FASTA file.

I have written two while loops that extract the BLAST descriptions and
the nucleotide sequences from the multi-FASTA file.

My problem is that I cannot nest one of the while loops inside the
other without losing the looping behaviour of one of them. I would like
to take the 1st BLAST description, then the 1st nucleotide sequence, then
the 2nd BLAST description, then the 2nd nucleotide sequence, and so
on. I just can't figure out how to alternate the results.

See script below:


use warnings;
use strict;
use Bio::SearchIO;
use Bio::SeqIO;


my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            my $qd = $hit->description;
            print $qd, "\n";
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

--
Ann (Nina) Gregory
Graduate Student
Rich Lab / Sullivan Lab
Soil, Water, Environmental Science Department
University of Arizona


From yonexhalaolv at gmail.com  Wed Feb 20 04:17:12 2013
From: yonexhalaolv at gmail.com (Sebastian Lau)
Date: Wed, 20 Feb 2013 01:17:12 -0800 (PST)
Subject: [Bioperl-l] failed to install via fink: no package found for
 specification 'bioperl-pm5100'!
Message-ID: <84fa1bcb-a39f-4847-bff2-e3a9c2b909ea@googlegroups.com>

Hi guys,

I was just about to install bioperl on my MacOS 10.7.5 via fink, but after
typing the command, fink said it couldn't find any package:

fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm5100
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm5100'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm588
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm588'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm586
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm586'!

I followed the instructions on the wiki. I don't know what's wrong.
Thanks for your help.


From awitney at sgul.ac.uk  Wed Feb 20 10:22:51 2013
From: awitney at sgul.ac.uk (Adam Witney)
Date: Wed, 20 Feb 2013 15:22:51 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124EA4B.5020409@sgul.ac.uk>


Hi Ann,

On 20/02/2013 05:20, Ann Gregory wrote:
> Hi BioPerl,
> 
> I am having issues with a BioPerl script. I have a blastxml file from a
> blastx blast and the original multifasta file containing the original
> nucleotides sequences.
> 
> I want to take the blast result (ie. the blast description) and annotate my
> multifasta file.
> 
> I have written 2 while loops that extract the blast descriptions as well as
> the nucleotide sequence from the multifasta file.
> 
> My problem is that I cannot incorporate one of the while loops into the
> other without loosing the loop property of one of the loops. I would like
> to take the 1st blast description, then the 1st nucleotide sequence, then
> the 2nd blast description, then the 2nd nucleotide sequence and so
> on...just can figure out how to alternate the results.
> 
> See script below:
> 
> 
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> 
> 
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $qd = $hit->description;
> print $qd, "\n";
> }
> }
> }
> 
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> print $nuc, "\n";
> }--

I think what you are proposing assumes that the loop over the BLAST
results will come back in the same order as the loop over the FASTA
file. This may be the case, but I'm not sure it's something I would rely on.

Anyway, I would loop over the BLAST results, storing the relevant data
to an array or hash and then loop over the fasta file to put the two
together. eg:

my $blast_data;

while ( ... blast data ... ) {
	...
	$blast_data->{$qd} = <whatever you want to store>
	...
}

while ( my $seqobj = $seqio->next_seq ) {
	my $id = $seqobj->id;
	print $blast_data->{$id}."\n";
}

Something along those lines... or have I misunderstood you? If so, can
you provide some more details, such as what you want your output to look
like?
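
The idea of joining the two data sets by ID rather than by position can be
sketched in plain Perl as follows. This is only an illustration under stated
assumptions: the %blast_data and @fasta contents are made-up stand-ins for
what Bio::SearchIO and Bio::SeqIO would deliver, the sub name annotate and
the 'no hit' label are hypothetical, and the hash is keyed by the query ID so
that it can be looked up with the FASTA ID.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for parsed input: in a real script these would be
# filled from Bio::SearchIO (query ID => hit description) and Bio::SeqIO.
my %blast_data = (
    seq1 => 'hypothetical protein A',
    seq3 => 'hypothetical protein C',
);
my @fasta = (
    [ seq1 => 'ATGAAA' ],
    [ seq2 => 'ATGCCC' ],    # no BLAST hit stored for this one
    [ seq3 => 'ATGGGG' ],
);

# Join the two data sets by sequence ID, so the order of the BLAST report
# and the order of the FASTA file do not have to agree.
sub annotate {
    my ($blast_data, $records) = @_;
    my @out;
    for my $rec (@$records) {
        my ($id, $seq) = @$rec;
        my $desc = exists $blast_data->{$id} ? $blast_data->{$id} : 'no hit';
        push @out, ">$id $desc\n$seq\n";
    }
    return @out;
}

print annotate(\%blast_data, \@fasta);
```

Because the lookup is by key, a sequence without a stored hit is simply
labelled instead of shifting every later pairing by one.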

HTH

Adam


From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 11:24:50 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:24:50 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124F8D2.4020904@uni-wuerzburg.de>

Oops, I just realized I had one loop too many in there. Adam is correct.
Sorry.

The last part of the code I sent you should look like this:

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    print ">$hits{$seqobj->display_id}\n";
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}


Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de



From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 11:14:29 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:14:29 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124F665.5050602@uni-wuerzburg.de>

Hi Ann,

I agree with Adam, but I was already writing my email, while his came 
in. Hope it helps:

I hope I understand correctly what you want to do.
Just to clarify, you queried a protein blast database with blastx and 
nucleotide queries. Now you want to associate the protein description 
for the FIRST blast hit with the corresponding nucleotide fasta file. Is 
that correct?
You have to put the two while loops into one another. Or associate the 
blast hits with the query descriptions. But it's not feasible to take 
the first blast hit and the first nucleotide fasta seq, then the 2nd of 
both etc, as Adam already pointed out.
You would have to iterate through both at the same time. I.e. take the 
first blast hit, then iterate through the nucleotide fasta until you 
find the hit. Then take the 2nd blast hit and iterate through the 
nucleotide fasta etc. It's probably easiest to do this in a hash.

Something along the lines of (not tested I just punched that in the E-Mail):

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            if ($hit->description eq $hit_desc) { # Only want the first blast hit
                next;
            }
            my $hit_desc = $hit->description;
            $hits{$result->query_description} = $hit_desc;
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
foreach my $query (keys %hits) {
    while (my $seqobj = $seqio->next_seq) {
        if ($seqobj->display_id eq $query) {
            print ">$hits{$query}\n";
            my $nuc = $seqobj->seq();
            print $nuc, "\n";
        }
    }
}

You might want to put some evalue cutoff in there to only score 
significant hits. Also if your nucleotide query multi-fasta file is very 
large, you might consider creating an index first:
http://www.bioperl.org/wiki/HOWTO:Local_Databases#Bio::Index
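
An e-value cutoff of the kind mentioned above could be sketched roughly like
this. It is a minimal sketch, not tested against real reports: the sub name
first_significant_description is hypothetical, the threshold is whatever the
caller passes in, and $result is only assumed to behave like a BioPerl result
object offering next_hit, with hits offering significance and description.

```perl
use strict;
use warnings;

# Return the description of the first hit whose e-value is at or below
# $max_evalue, or undef if the result has no such hit.
# $result is expected to behave like a Bio::Search::Result::ResultI object.
sub first_significant_description {
    my ($result, $max_evalue) = @_;
    while (my $hit = $result->next_hit) {
        my $evalue = $hit->significance;
        # Skip hits without an e-value or with one above the cutoff.
        next unless defined $evalue && $evalue <= $max_evalue;
        return $hit->description;
    }
    return;
}
```

Inside the result loop one might then write something like
"my $desc = first_significant_description($result, 1e-5);" and store $desc
only when it is defined; the value 1e-5 is just an example cutoff.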

Hope that helps!

Cheers,
Andreas

P.S.: Please next time include version numbers for BioPerl and Perl and 
a little more detail what you want to do. ;-)


--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de



From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 12:00:51 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 18:00:51 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
	<5124F8D2.4020904@uni-wuerzburg.de>
	<CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
Message-ID: <51250143.9050503@uni-wuerzburg.de>

Hey Ann,

Damn, it's not my best day ... Anyway, I wouldn't work with 
List::MoreUtils's each_array function, as this assumes that the blast 
hits and the nucleotide queries are in the same order (as Adam pointed 
out). Rather use a hash which associates a key to a certain value. Also, 
the hash can be used to skip sequences that have no hits.
Here's my new version:

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            # hash: associate query_desc (key) with hit_desc (value)
            $hits{$result->query_description} = $hit->description;
            # jump out of the while loop; this should resolve getting only
            # the first hit
            last;
        }
        last; # see above
    }
}


my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    # only true if display_id is associated with a hit_desc; this should
    # skip seqs without hits
    if ($hits{$seqobj->display_id}) {
        print ">$hits{$seqobj->display_id}\n";
        my $nuc = $seqobj->seq();
        print $nuc, "\n";
    }
}
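
A slightly more compact variant of the same approach might look like the
sketch below. Two things here are assumptions rather than anything confirmed
in this thread: that calling next_hit once per result is enough to get the
first hit, and that query_name (not query_description) is what corresponds to
the display_id of the FASTA record. The sub names first_hits_by_query and
print_annotated are hypothetical, and the arguments are only assumed to
behave like Bio::SearchIO and Bio::SeqIO streams.

```perl
use strict;
use warnings;

# Map each query ID to the description of its first hit.
# $search_in is expected to behave like a Bio::SearchIO stream.
sub first_hits_by_query {
    my ($search_in) = @_;
    my %hits;
    while (my $result = $search_in->next_result) {
        my $hit = $result->next_hit or next;    # query without any hit
        $hits{ $result->query_name } = $hit->description;
    }
    return \%hits;
}

# Write an annotated FASTA record for every sequence that has a stored hit.
# $seqio is expected to behave like a Bio::SeqIO stream; $out is a filehandle.
sub print_annotated {
    my ($seqio, $hits, $out) = @_;
    while (my $seqobj = $seqio->next_seq) {
        my $id = $seqobj->display_id;
        next unless exists $hits->{$id};        # skip sequences without hits
        print {$out} ">$id $hits->{$id}\n", $seqobj->seq, "\n";
    }
    return;
}
```

With BioPerl loaded, the two subs could be driven with the same
Bio::SearchIO->new(...) and Bio::SeqIO->new(...) objects as above, passing
\*STDOUT as the output handle.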

Cheers,
Andreas

P.S.: I redirected your mail to the BioPerl mailing list, others might 
profit from my mistakes ;-) ...

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 17:35, Ann Gregory wrote:
> Hi Andreas,
>
> Thanks for your help! I don't understand how this gets the first blast hit:
>
> if ($hit->description eq $hit_desc) { # Only want the first blast hit
> next;
> }
>
> I tried this and seems to be working...but I can't get the 1st blast hit
> or skip the sequences that had no hits. Do you know any quick fixes?
>
> *
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> use List::MoreUtils qw(each_array);
>
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> my @ids;
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $match = $result->num_hits;
> push(@ids, $qd);
> }
> }
> }
> }
>
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> my @seqs;
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> push(@seqs, $nuc);
> }
>
> my $it = each_array(@ids, @seqs);
> while(my($ids,$seqs)=$it->()){
> print $ids, "\n", $seqs, "\n";
> }
> *
>
> Thanks again!
> ~Ann
>
> On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach
> <andreas.leimbach at uni-wuerzburg.de> wrote:
>
>     oops, I just realized I had one loop too many in there. Adam is
>     correct. Sorry.
>
>     The last part of the code I sent you should look like this:
>
>
>     my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>     while (my $seqobj = $seqio->next_seq) {
>     print ">$hits{$seqobj->display_id}\n";
>
>     my $nuc = $seqobj->seq();
>     print $nuc, "\n";
>     }
>
>
>     Cheers,
>     Andreas
>
>
>     --
>     Andreas Leimbach
>     Universität Münster
>     Institut für Hygiene
>     Mendelstr. 7
>     D-48149 Münster
>     Germany
>
>     Tel.: +49 (0)551 39 3843
>     E-Mail: andreas.leimbach at uni-wuerzburg.de
>
>     On 20.2.13 06:20, Ann Gregory wrote:
>
>         Hi BioPerl,
>
>         I am having issues with a BioPerl script. I have a blastxml file
>         from a blastx search and the original multifasta file containing
>         the original nucleotide sequences.
>
>         I want to take the blast result (i.e. the blast description) and
>         annotate my multifasta file.
>
>         I have written 2 while loops that extract the blast descriptions
>         as well as the nucleotide sequences from the multifasta file.
>
>         My problem is that I cannot incorporate one of the while loops
>         into the other without losing the loop property of one of the
>         loops. I would like to take the 1st blast description, then the
>         1st nucleotide sequence, then the 2nd blast description, then the
>         2nd nucleotide sequence and so on... I just can't figure out how
>         to alternate the results.
>
>         See script below:
>
>
>         use warnings;
>         use strict;
>         use Bio::SearchIO;
>         use Bio::SeqIO;
>
>
>         my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
>         "$ARGV[0]");
>         while (my $result = $search_in->next_result) {
>         while (my $hit = $result->next_hit) {
>         while (my $hsp = $hit->next_hsp) {
>         my $qd = $hit->description;
>         print $qd, "\n";
>         }
>         }
>         }
>
>         my $seqio = Bio::SeqIO->new(-format => 'fasta', -file =>
>         "$ARGV[1]");
>         while (my $seqobj = $seqio->next_seq) {
>         my $nuc = $seqobj->seq();
>         print $nuc, "\n";
>         }
>         --
>         Ann (Nina) Gregory
>         Graduate Student
>         Rich Lab / Sullivan Lab
>         Soil, Water, Environmental Science Department
>         University of Arizona
>         _________________________________________________
>         Bioperl-l mailing list
>         Bioperl-l at lists.open-bio.org
>         http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
>
> --
> Ann (Nina) Gregory
> Graduate Student
> Rich Lab / Sullivan Lab
> Soil, Water, Environmental Science Department
> University of Arizona
>
>
>


From cjfields at illinois.edu  Wed Feb 20 13:24:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 20 Feb 2013 18:24:58 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <51250143.9050503@uni-wuerzburg.de>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
	<5124F8D2.4020904@uni-wuerzburg.de>
	<CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
	<51250143.9050503@uni-wuerzburg.de>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2EB4A@CHIMBX5.ad.uillinois.edu>

If this is meant to be done repeatedly, using the same FASTA files for a bunch of BLAST reports, it might be worth setting up a flat file index and using that to look up and grab the sequences; it should be a LOT faster, and only the first pass (generation of the initial index) would take a little time.  Look at Bio::DB::Fasta for an example.
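
To illustrate the idea of a flat file index, here is a rough, BioPerl-free sketch: one pass records the byte offset of every header line, and later lookups seek straight to the record instead of re-reading the file. The helper names build_index and fetch_seq are made up for this example; this is not how Bio::DB::Fasta is implemented, which additionally stores the index on disk and, as far as I know, lets you fetch a sequence with $db->seq($id).

```perl
use strict;
use warnings;

# Build an in-memory index: sequence ID => byte offset of its header line.
sub build_index {
    my ($path) = @_;
    my %offset;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $pos = tell $fh;
    while (my $line = <$fh>) {
        $offset{$1} = $pos if $line =~ /^>(\S+)/;
        $pos = tell $fh;    # offset of the next line
    }
    close $fh;
    return \%offset;
}

# Fetch one sequence by seeking directly to its record.
sub fetch_seq {
    my ($path, $index, $id) = @_;
    return unless exists $index->{$id};
    open my $fh, '<', $path or die "Cannot open $path: $!";
    seek $fh, $index->{$id}, 0;
    my $header = <$fh>;             # skip the header line
    my $seq = '';
    while (my $line = <$fh>) {
        last if $line =~ /^>/;      # the next record starts here
        chomp $line;
        $seq .= $line;
    }
    close $fh;
    return $seq;
}
```

Building the index costs one full read of the file; after that, each lookup only reads the lines of the requested record.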

chris



From carandraug+dev at gmail.com  Mon Feb 25 05:08:23 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Mon, 25 Feb 2013 10:08:23 +0000
Subject: [Bioperl-l] module for description of sequence variants (where to
	place code)
Message-ID: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>

Hi

I'm writing a perl module to write a description of the variance
between 2 sequences as described on
http://www.hgvs.org/mutnomen/recs-prot.html

Basically, given 2 sequences, it would return something like "p.Lys2del
p.His25_Met26insGln" if those are the differences. It also accounts
for the existence of - characters on the sequences that may come from
their alignment.

My question is, where on the project tree should I place the module?

Also, is there something already written that would convert from 1 to
3 letter code?

Carnë


From andreas.leimbach at uni-wuerzburg.de  Mon Feb 25 05:32:43 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Mon, 25 Feb 2013 11:32:43 +0100
Subject: [Bioperl-l] module for description of sequence variants (where
 to place code)
In-Reply-To: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>
References: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>
Message-ID: <512B3DCB.7050008@uni-wuerzburg.de>

Hi Carnë,

for your last question:
You can convert aa strings from one to three letter code with 
'Bio::SeqUtils'.
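
If you only need the mapping itself (e.g. to build names like "Lys2"),
a plain lookup table is enough. The sketch below is not Bio::SeqUtils
code; one_to_three is a made-up helper covering the 20 standard amino
acids, and anything else is mapped to "Xaa" as an assumption. With
BioPerl installed, Bio::SeqUtils->seq3($seqobj) should give you the same
conversion for a sequence object.

```perl
use strict;
use warnings;

# One-letter => three-letter codes for the 20 standard amino acids.
my %three = (
    A => 'Ala', R => 'Arg', N => 'Asn', D => 'Asp', C => 'Cys',
    Q => 'Gln', E => 'Glu', G => 'Gly', H => 'His', I => 'Ile',
    L => 'Leu', K => 'Lys', M => 'Met', F => 'Phe', P => 'Pro',
    S => 'Ser', T => 'Thr', W => 'Trp', Y => 'Tyr', V => 'Val',
);

# Convert a string of one-letter codes; unknown letters become 'Xaa'.
sub one_to_three {
    my ($aa) = @_;
    return join '', map { $three{ uc $_ } // 'Xaa' } split //, $aa;
}

print one_to_three('KHMQ'), "\n";
```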

Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de



From genehack at genehack.org  Wed Feb 27 19:57:48 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Wed, 27 Feb 2013 16:57:48 -0800
Subject: [Bioperl-l] YAPC talks?
Message-ID: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>

Hi -

Is there anyone that was planning on submitting a Bioperl talk to
YAPC::NA? In an unrelated conversation, one of the organizers
expressed an interest in getting a Bioperl talk this year.

If no one else is planning on a talk submission, Jay Hannah (aka
deafferret) and I are promising/threatening a tag-team style "Bioperl
rules / Bioperl sucks" overview/state of the dist style talk...

thanks,
john.


From cjfields at illinois.edu  Wed Feb 27 21:48:55 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 28 Feb 2013 02:48:55 +0000
Subject: [Bioperl-l] YAPC talks?
In-Reply-To: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
References: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6E705CD3@CHIMBX5.ad.uillinois.edu>

At the moment I personally have no plans on going, but I think a no-holds-barred bioperl talk is a good idea.  

chris



From hlapp at drycafe.net  Wed Feb 27 22:20:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 27 Feb 2013 22:20:34 -0500
Subject: [Bioperl-l] YAPC talks?
In-Reply-To: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
References: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
Message-ID: <42C1F1B8-FE26-43A8-B601-E80D17D215EC@drycafe.net>


On Feb 27, 2013, at 7:57 PM, John SJ Anderson wrote:

> Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl
> rules / Bioperl sucks" overview/state of the dist style talk...

Please videotape. I'll be sure to watch and promote it :-)

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From saladi1 at illinois.edu  Thu Feb 28 01:58:20 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Wed, 27 Feb 2013 22:58:20 -0800
Subject: [Bioperl-l] EUtilities Cookbook - Accn to gi
Message-ID: <CAARX5cXXD_DNb+Sbt-_zXvsn63QAaVBcot9YGtEjQ7ucrqAEKQ@mail.gmail.com>

Hi,

I think that rettype for the section "Get GIs for a list of accessions"
should be

-rettype => 'gi');

instead of 'gilist' as it is now. I think this change is due to a change in
NCBI eutils.

webpage:
http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Get_GIs_for_a_list_of_accessions

Thanks,
Shyam


From fossandonc at hotmail.com  Thu Feb 28 10:36:34 2013
From: fossandonc at hotmail.com (=?iso-8859-1?Q?Francisco_J._Ossand=F3n?=)
Date: Thu, 28 Feb 2013 12:36:34 -0300
Subject: [Bioperl-l] Fix for Bug #3376 broke somewhere else
Message-ID: <SNT133-ds14A180BAFAE068EE359031CFFE0@phx.gbl>

Hi,
I was re-checking Bug #3302 using the Bio::SearchIO modules of the
repository and found that now it can't parse a Hmmer2 file that was
previously fine. After tracking the problem, I discovered that a change in a
regular expression to fix another bug broke the parse.
 
The fix for the Bug #3376 consisted in adding an extra condition to omit
lines where end of domain indicator is split across lines
(https://redmine.open-bio.org/issues/3376):
TEST: domain 1 of 1, from 8 to 97: score 184.7, E = 2.5e-56
                   *->svfqqqqssksttgstvtAiAiAigYRYRYRAvtWnsGsLssGvnDn
                      sv+qqqq+  +    +vtAiAiAigYRYRYRAv Wn GsLs G nDn
        Test     8    SVYQQQQGGSA----MVTAIAIAIGYRYRYRAVVWNKGSLSTGTNDN 50   

                   DnDqqsdgLYtiYYsvtvpssslpsqtviHHHaHkasstkiiikiePr<-
                   DnDq +d LYtiYYsvtv +ss+p q+v+HHHaH+asstkiiiki P   
        Test    51 DNDQAAD-LYTIYYSVTVSASSWPGQSVTHHHAHPASSTKIIIKIAPS   97   

                   *

        Test     -   -
This case is characterized by the 2 dashes in the line...

So the expression added in hmmer2.pm - 'next_result'
(https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2):
                        elsif (CORE::length($_) == 0
                            || ( $count != 1 && /^\s+$/o )
                            || /^\s+\-?\*\s*$/
                            || /^.+\-\s+\-\s*$/ ) ### <--- This regex was designed for bug 3376
                        {
                            next;
                        }

But the expression used is too broad because it has "^.+" just before
the 2 dashes, so it also skips lines like these, which are full of dashes:
                   KyACrqCdtiVQAPaPakpIErGiptaGLLArvlVSKyaEHlPLYRQsEI
                                                                     
  lcl|gi|340     - -------------------------------------------------- -    

                   yaRqGVeiaRstLadWVgrtgarLaPLvdALaeyVLkeGklHADeTPVqV
                         +i  s L   V++ + r                           
  lcl|gi|340 60938 ------AIMISGLIHGVSARCLRF-------------------------- 60955

I think a reasonable fix that still handles the original bug and restores
parsing for this case is to add an extra \s+ to the regex just before the
first dash, so that the expression makes sure the first dash is the one that
comes AFTER the description (replacing the usual coordinate number)
and is not the last dash of an alignment or of a series of dashes like the
one above:
                        elsif (CORE::length($_) == 0
                            || ( $count != 1 && /^\s+$/o )
                            || /^\s+\-?\*\s*$/
                            || /^.+\s+\-\s+\-\s*$/ ) ### <--- Tweaked regex
                        {
                            next;
                        }
I tested it and it works fine, hope you find the fix acceptable.
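
For anyone who wants to check the two expressions outside the parser, here
is a small stand-alone sketch. The two test lines are rebuilt by hand from
the examples above (the run of 50 dashes is my reconstruction of the gap
row), and skipped() is just a helper for this example, not part of
hmmer2.pm.

```perl
use strict;
use warnings;

my $old = qr/^.+\-\s+\-\s*$/;       # condition added for bug 3376
my $new = qr/^.+\s+\-\s+\-\s*$/;    # tweaked condition proposed above

# Returns 1 when a line matches the regex, i.e. the parser would skip it.
sub skipped {
    my ($regex, $line) = @_;
    return $line =~ $regex ? 1 : 0;
}

# Split end-of-domain marker from bug 3376: should be skipped.
my $marker = '        Test     -   -';
# Alignment row made only of gap dashes: should be parsed, not skipped.
my $gaps   = '  lcl|gi|340     - ' . ('-' x 50) . ' -    ';

for my $case ([ marker => $marker ], [ gaps => $gaps ]) {
    my ($name, $line) = @$case;
    printf "%-6s old=%d new=%d\n", $name,
        skipped($old, $line), skipped($new, $line);
}
```

Both expressions skip the marker line; only the old one also skips the
all-gap alignment row.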

Cheers,

--
Francisco J. Ossandon
Bioinformatician.
Ph.D. Candidate, University Andres Bello.
Center for Bioinformatics and Genome Biology,
Fundacion Ciencia para la Vida.
Santiago, Chile.
www.cienciavida.cl/CBGB.htm


From PDagosto at edgebio.com  Mon Feb 25 11:50:34 2013
From: PDagosto at edgebio.com (Phil Dagosto)
Date: Mon, 25 Feb 2013 16:50:34 +0000
Subject: [Bioperl-l] Error when running Build.PL
Message-ID: <DC8C6FE0AED292469CF192A00459937BC0F8660B@EDGE-EXCH02.edgebio.com>

Greetings,

I downloaded BioPerl 1.6.1 from this location: http://www.bioperl.org/wiki/Getting_BioPerl

When I ran Build.PL with all of the default settings chosen in the interactive mode I got the following error message:

Could not get valid metadata. Error is: Invalid metadata structure. Errors: 'Perl_5' for 'license' does not have a URL scheme (resources -> license) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::gff -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::WebAgent -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::EUtilParameters -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::OntologyIO::InterProParser -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Biblio::IO::medlinexml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::strider -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::RandomFactory -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA::ESEfinder -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::game::gameSubs -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::interpro -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::berkeleydb -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::entrezgene -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tinyseq -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::chadoxml -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::SeqIO::game::gameWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::FileCache -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::bsml_sax -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Primer3 -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::HtSNP -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tree::Compatible -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Taxonomy::entrez -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::agave -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::TagHaplotype -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::SeqFeature::Store::FeatureFileLoader -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::Protein* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::blastxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::EUtilities -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::Tree::Draw::Cladogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tigrxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Collection -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Draw::Pictogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::Writer::BSMLResultWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::HIVQuery -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::TreeIO::svggraph -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::eutils -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern::BackTranslate -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::GenBank -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Variation::IO::xml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::GraphViz -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Annotated -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::DB::NCBIHelper -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::HIV -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Run::RemoteBlast -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::excel -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::ClusterIO::dbsnp -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Microarray::Tools::ReseqChip -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::soap -> requires) [Validation: 1.2]
at /usr/local/lib/perl5/5.10.1/Module/Build/Base.pm line 4559

Could not create MYMETA files
Creating new 'Build' script for 'BioPerl' version '1.006001'

I have no idea whether this is a problem or not, or if I can proceed. Also, I'm confused by the version number referenced in the last line. 1.006001 is our current version - I thought I was installing version 1.6.1. Are these version numbers equivalent, i.e., are the zeros not meaningful?
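
On the version question, Perl's core version module can be used to compare the two notations. This is a small sketch, assuming a Perl recent enough to ship version.pm; it treats 1.006001 as a decimal version (three digits per component after the point) and v1.6.1 as the dotted form.

```perl
use strict;
use warnings;
use version;

my $decimal = version->parse('1.006001');   # as printed by Build.PL
my $dotted  = version->parse('v1.6.1');     # as written on the download page

print $decimal->normal, "\n";               # decimal form shown as dotted
print $dotted->numify,  "\n";               # dotted form shown as decimal
print "same version\n" if $decimal == $dotted;
```

If the two objects compare equal, the zeros are only padding between the components and both strings name the same release.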

I was actually looking for version 1.2.3 (or greater) - where can I find that?

Thanks,
Phil

Phil Dagosto
Sr. Software Engineer
Edge Bio
201 Perry Parkway, Suite 5
Gaithersburg, MD 20850

pdagosto at edgebio.com
(240) 912-8669


From chapmanb at 50mail.com  Thu Feb 28 21:30:01 2013
From: chapmanb at 50mail.com (Brad Chapman)
Date: Thu, 28 Feb 2013 21:30:01 -0500
Subject: [Bioperl-l] Coming soon: BOSC/Broad Hackathon, BOSC Codefest
Message-ID: <874ngvua1i.fsf@fastmail.fm>


Hi all; 
There are some upcoming coding events and conferences of interest to open source
biology programmers:

- BOSC/Broad Interoperability Hackathon -- This is a two day coding session at
  the Broad Institute in Cambridge, MA on April 7-8 focused on improving tool
  interoperability.
  
  Sign up and details: http://j.mp/XJT6ew
  
- Codefest at the Bioinformatics Open Source Conference -- This year BOSC is
  taking place in Berlin from July 19-20 and we'll have a two day coding session
  before the conference. This is the 4th year of Codefests and they've proven to
  be a productive and fun time to work collectively on open source projects.

  Sign up and details: http://www.open-bio.org/wiki/Codefest_2013
  BOSC conference: http://www.open-bio.org/wiki/BOSC_2013

Here are the key dates for the events and abstracts:

April  7-8, 2013: BOSC/Broad Interoperability Hackathon, Cambridge, MA
April   12, 2013: BOSC abstracts due
July 17-18, 2013: Codefest 2013, Berlin
July 19-20, 2013: BOSC 2013, Berlin

Looking forward to seeing everyone this spring and summer for plenty of fun
science and code,
Brad


From koriege at googlemail.com  Fri Feb  1 02:49:20 2013
From: koriege at googlemail.com (koriege at googlemail.com)
Date: Thu, 31 Jan 2013 18:49:20 -0800 (PST)
Subject: [Bioperl-l] problem with Bio::*::Fasta id_parser
Message-ID: <e9e0c428-0467-4b62-a37d-cfcb91ed818b@googlegroups.com>

Hi,

I tried two methods to create a BioPerl FASTA database, but both fail to
extract the substring from my headers.
Can someone explain to me why I get the standard header, or show me a
workaround?
thanks in advance.
pyr0

i)
my $objDB = Bio::Index::Fasta->new(-filename => $PATHdbIdx, -write_flag => 
1);
$objDB->id_parser(\&get_id);
$objDB->make_index(glob($objParameter->dbGenome()));

sub get_id {
   my $header = shift;
   $header =~ /^>.*\bsp\|([A-Z]\d{5}\b)/;
   $1;
}

output
Use of uninitialized value $id in concatenation (.) or string at 
/usr/share/perl5/Bio/Index/Abstract.pm line 753, <$FASTA> line 1.
Use of uninitialized value $id in exists at 
/usr/share/perl5/Bio/Index/Abstract.pm line 754, <$FASTA> line 1.
Use of uninitialized value $id in hash element at 
/usr/share/perl5/Bio/Index/Abstract.pm line 757, <$FASTA> line 1.
gi|376282008|ref|NC_016798.1|

ii)
my $PATHdbIdx=catfile($objParameter->DIR,'data','db.idx');
unlink($PATHdbIdx);
my $objDB = Bio::DB::Fasta->new($objParameter->dbGenome(), -makeid => 
\&get_id);
$objDBgenome->set(\$objDB);

output:
Use of uninitialized value $key in pattern match (m//) at 
/usr/share/perl5/Bio/DB/Fasta.pm line 1178.
Use of uninitialized value $id in exists at 
/usr/share/perl5/Bio/DB/Fasta.pm line 617.
gi|376282008|ref|NC_016798.1|
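
A possible explanation, sketched below: the header shown in the output
(gi|376282008|ref|NC_016798.1|) has no "sp|" field, so the pattern cannot
match and $1 is left undefined, which would account for the "uninitialized
value" warnings. The variant below returns the capture only when the match
succeeds and otherwise falls back to the first word of the header. It
assumes the callback receives the full header line starting with ">"; the
SwissProt header in the demo is invented.

```perl
use strict;
use warnings;

# Return the SwissProt accession if present, else the first header word.
sub get_id {
    my ($header) = @_;
    return $1 if $header =~ /^>.*\bsp\|([A-Z]\d{5}\b)/;
    return $1 if $header =~ /^>\s*(\S+)/;    # fallback: first word
    return;                                  # nothing usable found
}

for my $header ('>gi|376282008|ref|NC_016798.1| some genome',
                '>sp|P12345|NAME_HUMAN invented example') {
    my $id = get_id($header);
    print defined $id ? $id : '(no id)', "\n";
}
```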


From jason.stajich at gmail.com  Fri Feb  1 06:58:57 2013
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 31 Jan 2013 22:58:57 -0800
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
References: <mailman.7.1359565204.26693.bioperl-l@lists.open-bio.org>
	<575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
Message-ID: <CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>

Dan - 

I think the answer is yes if others are doing it - I am not in a position to be much of a main coder.

I don't know which format you speak of here or if you had to write something for the text blast changes or something else.  Specific bug reports on formats that aren't working are always helpful.  The XML format has been pretty stable, so I would suggest using it if you are simply parsing reports rather than reading them.

Chris posted instructions on how to contribute and the move to github simplifies this.  That you had to write a whole new parser seems probably a bit severe - I hope that in the future people can speak to the problems sooner. If I hit a wall with something I can't do I usually write the code to fix it and contribute it back but I don't play follow-the-format-changes with the tools anymore, but hopefully others like yourself can make the contributions.

If you are speaking to the response I made to the question below, I don't think anyone will try to support the NCBI's additional markup that refers to the upstream and downstream features, as laid out in the text files, without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable.
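
For anyone who needs those lines in the meantime, here is a rough sketch of pulling them out of a text report with a regular expression. It assumes the layout is exactly the one quoted in the question below ("Features flanking this part of subject sequence:" followed by "N bp at 5' side: name" lines). The sub name and hash keys are made up for illustration and are not part of SearchIO.

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl method: collect the "Features flanking"
# entries from a chunk of BLAST text output.
sub parse_flanking_features {
    my ($text) = @_;
    my @features;
    my $in_block = 0;

    for my $line ( split /\n/, $text ) {
        # The header line opens a block of flanking-feature entries.
        if ( $line =~ /^\s*Features flanking this part of subject sequence:/ ) {
            $in_block = 1;
            next;
        }
        next unless $in_block;

        # Entry lines look like: "8872 bp at 5' side: cytochrome P450 90B1"
        if ( $line =~ /^\s*(\d+) bp at ([35])' side:\s*(.+?)\s*$/ ) {
            push @features, { distance => $1, side => $2, name => $3 };
        }
        else {
            $in_block = 0;    # anything else ends the block
        }
    }
    return \@features;
}

my $report = <<'END';
Features flanking this part of subject sequence:
8872 bp at 5' side: cytochrome P450 90B1
402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K
END

for my $f ( @{ parse_flanking_features($report) } ) {
    printf "%s' side, %d bp: %s\n", $f->{side}, $f->{distance}, $f->{name};
}
```

This only extracts what the report prints. It does not fetch the flanking sequences themselves and will break if NCBI changes the wording.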

best wishes,
Jason
On Jan 30, 2013, at 1:40 PM, Dan kilburn <dr_kilburn59 at yahoo.com> wrote:

> Hi Jason,
> 
> Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help.
> 
> --Dan
> Sent from my iPhone
> 
> On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote:
> 
>> Send Bioperl-l mailing list submissions to
>>   bioperl-l at lists.open-bio.org
>> 
>> To subscribe or unsubscribe via the World Wide Web, visit
>>   http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> or, via email, send a message with subject or body 'help' to
>>   bioperl-l-request at lists.open-bio.org
>> 
>> You can reach the person managing the list at
>>   bioperl-l-owner at lists.open-bio.org
>> 
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Bioperl-l digest..."
>> 
>> 
>> Today's Topics:
>> 
>>  1. Re:  Parsing Blast-Report extracting "Features flanking    .."
>>     (Jason Stajich)
>> 
>> 
>> ----------------------------------------------------------------------
>> 
>> Message: 1
>> Date: Tue, 29 Jan 2013 11:00:16 -0800
>> From: Jason Stajich <jason.stajich at gmail.com>
>> Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features
>>   flanking    .."
>> To: buschj at hhu.de
>> Cc: bioperl-l at lists.open-bio.org
>> Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com>
>> Content-Type: text/plain;    charset=us-ascii
>> 
>> We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that.
>> 
>> However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features.
>> 
>> basically:
>> - download the genome and GFF for arabidopsis
>> - align your sRNA to the genome with a short read aligner - bowtie, bwa, others
>> - convert your sam to bam file with SAMtools or picard
>> - compare the location of features with the reads to get expression summaries or individuals reads with BEDTools
>> 
>> 
>> On Jan 25, 2013, at 2:20 AM, jobu <buschj at hhu.de> wrote:
>> 
>>> On 22.01.2013 19:03, Mgavi Brathwaite wrote:
>>>> What upstream and downstream elements are you interested in?
>>> 
>>> 
>>> I've got a huge pile of short RNA reads.
>>> Part of the question now is whether those RNA fragments originate from
>>> siRNA events,
>>> or may represent miRNAs / parts of pre-miRNAs.
>>> 
>>> So I did an online  blast search against database nt.
>>> The resulting report quite often just gives subject information like this:
>>> 
>>> -----
>>>> gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence
>>> Length=23459830
>>> -----
>>> 
>>> Now I would like to get the hit's neighbouring regions  for further
>>> analysis.
>>> Preferably I would like to do that  in an automized way, but the only
>>> possible action with this kind of subject gi | description would be to
>>> fetch the entire chromosomal  sequence I guess ?
>>> 
>>> However,
>>> right below the line above, the report states more precisely:
>>> 
>>> ------
>>> Features flanking this part of subject sequence:
>>> 8872 bp at 5' side: cytochrome P450 90B1
>>> 402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K
>>> ------
>>> 
>>> Still I would like to have the possibility to automatically fetch the
>>> subject's sequence(s),
>>> as of now I think  parsing the report with SearchIO won't let me aquire
>>> that information, because SearchIO does not recognize report sections
>>> like those.
>>> 
>>> I hope I did not miss any of SearchIOs capabilities, but I could not
>>> find any method covering my wish?!
>>> 
>>> Right now maybe the only way to get the information I want is to
>>> construct my own parser and write it out into a separate file, which in
>>> turn again  I could read into a hash before processing the Blast-Report
>>> with SearchIO to combine both data for further automized work.
>>> 
>>> I am aware though that even successfully getting the flanking features
>>> would leave me with the more or less wide  intergenic gap my hsp is
>>> located in.
>>> 
>>> However I'm in need of a way to get the flanking features including
>>> their annotation and the region spanning between them.
>>> But I hope I do not have to get complete sequences to accomplish that,
>>> as this would be kind of an overkill.
>>> 
>>> with kind regards
>>> Jochen
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>> 
>> 
>> 
>> 
>> ------------------------------
>> 
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>> End of Bioperl-l Digest, Vol 117, Issue 13
>> ******************************************
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From dr_kilburn59 at yahoo.com  Fri Feb  1 14:25:34 2013
From: dr_kilburn59 at yahoo.com (Dan Kilburn)
Date: Fri, 1 Feb 2013 06:25:34 -0800 (PST)
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>
References: <mailman.7.1359565204.26693.bioperl-l@lists.open-bio.org>
	<575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
	<CD561DB2-ACFC-4592-B83B-829F44ADE6A3@gmail.com>
Message-ID: <1359728734.27412.YahooMailNeo@web162006.mail.bf1.yahoo.com>

Hi Jason,

Thanks for the detailed feedback.  The real reason I had to write my own parser is that even with close, repeated support from NCBI we couldn't get XML output with short_web_blast.pl because the parameter that turns on XML output was not functioning (they've probably fixed it by now), and I had to crank out a parser asap to support a job talk.

I don't think the upstream and downstream feature reports are particularly useful, because in mammals they tend to be so far away that they are not likely to be biologically relevant.  But the internal motif reports are useful, maybe especially if you are blasting short reads, like I was.  A 16-mer preserved domain hit is really good if you're blasting 18-mer Illumina short reads.

As far as my involvement goes, I got diagnosed with cancer on Wednesday, so I'll be taking a step back until next week's surgery and taking a lot of deep breaths.  On the other hand, this just makes me more motivated: I've been thinking a lot about time, and timely contributions, the last two days.

Cheers,
Dan
 



From carandraug+dev at gmail.com  Sun Feb  3 01:44:31 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Sun, 3 Feb 2013 01:44:31 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue with
	output option
Message-ID: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>

Hi

the TCoffee module does not accept options of the named argument type:

-arg => option

one needs to do like

'arg' => option

Is there a special reason for this? I tracked down this to the commit

7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e

12 years ago[1]. A comment on the code actually says "don't want named
parameters"[2] (though the commit message sounds pretty innocuous
"migrated to new Bio::Root::RootI chained new"). Is there a reason for
this? The rest of bioperl has no issue with named parameters, and the
API should be the same as Clustalw which also has no problem with it.
This is very easy to fix, I can submit a pull request no problem.

Also, shouldn't the code complain in the case of non-supported
options? It took me a very long time to find the problem because
there were no complaints coming from the code.

There is also a problem with the way it handles the output option.
I'll have to look closer into it, but the documentation is simply
incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
(undocumented), works fine.

Carnë
[1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
[2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374


From cjfields at illinois.edu  Sun Feb  3 21:54:51 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 3 Feb 2013 21:54:51 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue
 with	output option
In-Reply-To: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
References: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>

Carnë,

On Feb 2, 2013, at 7:44 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> the TCoffee module does not options of the named argument type:
> 
> -arg => option
> 
> one needs to do like
> 
> 'arg' => option
> 
> Is there a special reason for this? I tracked down this to the commit
> 
> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
> 
> 12 years ago[1]. A comment on the code actually says "don't want named
> parameters"[2] (though the commit message sounds pretty innocuous
> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
> this? The rest of bioperl has no issue with named parameters, and the
> API should be the same as Clustalw which also has no problem with it.
> This is very easy to fix, I can submit a pull request no problem.

IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones.  This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.  

The downside of big changes like this: potential backwards compatibility issues.  Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change.  I don't have a problem breaking this with a bioperl 2.0 release, though.  

> Also, shouldn't the code complain in the case of non-supported
> options? Took me a very long time to find out the problem because
> there was no complaints coming from the code.

Yes, it should complain when options are given that do not make sense, some validation would help there.  With some modules this might be a side-effect of using AUTOLOAD or simply not checking the parameters.

> There is also a problem with the way it handles the output option.
> I'll have to look closer into it, but the documentation is simply
> incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
> (undocumented), works fine.

That's entirely possible.

> Carnë
> [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
> [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374

As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it.  Infernal was this way IIRC.  Maybe these should just be simply stored as a semi-validated set of key-value pairs.  
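
A rough sketch of what such a "semi-validated set of key-value pairs" could look like, accepting both the '-arg' and 'arg' spellings and complaining about anything unknown. Everything here is hypothetical: normalize_params and the %ALLOWED list are illustrative names, not the TCoffee wrapper's actual code or its real option list.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical whitelist for illustration; NOT the real TCoffee option list.
my %ALLOWED = map { $_ => 1 } qw(output ktuple matrix);

# Store parameters as plain key-value pairs. A leading '-' is optional,
# and unknown names raise an error instead of being silently ignored.
sub normalize_params {
    my (%args) = @_;
    my %params;
    for my $key ( keys %args ) {
        ( my $name = lc $key ) =~ s/^-//;    # '-output' and 'output' both become 'output'
        croak "Unknown parameter '$key'" unless $ALLOWED{$name};
        $params{$name} = $args{$key};
    }
    return \%params;
}

my $params = normalize_params( -output => 'fasta', 'ktuple' => 2 );
print join( ', ', map { "$_=$params->{$_}" } sort keys %$params ), "\n";
```

Because the names are hash keys and not method names, options containing hyphens or starting with a digit need no special handling.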

chris


From carandraug+dev at gmail.com  Mon Feb  4 04:34:22 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Mon, 4 Feb 2013 04:34:22 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue
 with output option
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_3TM5+yD3s3=npWb1sucmy_smSLejxz3Cr6C0Rg6h3Dyw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_2b2+Dy-HW3ngjNd2tjaTxgvFpTR-rKzq7HOO-6ZzyoTQ@mail.gmail.com>

On 3 February 2013 21:54, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 2, 2013, at 7:44 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> Hi
>>
>> the TCoffee module does not options of the named argument type:
>>
>> -arg => option
>>
>> one needs to do like
>>
>> 'arg' => option
>>
>> Is there a special reason for this? I tracked down this to the commit
>>
>> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
>>
>> 12 years ago[1]. A comment on the code actually says "don't want named
>> parameters"[2] (though the commit message sounds pretty innocuous
>> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
>> this? The rest of bioperl has no issue with named parameters, and the
>> API should be the same as Clustalw which also has no problem with it.
>> This is very easy to fix, I can submit a pull request no problem.
>
> IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones.  This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.
>
> The downside of big changes like this: potential backwards compatibility issues.  Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change.  I don't have a problem breaking this with a bioperl 2.0 release, though.

Should passing the tests be enough? There's one for TCoffee. At the
moment I don't see how this would cause compatibility issues, since we
are adding an option, not removing one. But the comment in the code,
stating plainly that the -param API was not wanted, caught me by
surprise, which is why I'm asking.

> As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it.  Infernal was this way IIRC.  Maybe these should just be simply stored as a semi-validated set of key-value pairs.

From a quick glance at the list of TCoffee parameters, I don't at the
moment see any that should cause a problem.

I have submitted a bug report[1] which mentions some other issues I
found with TCoffee. If someone could comment on them, that would be
great and I can start fixing them.

Carnë

[1] https://redmine.open-bio.org/issues/3406


From yuf228 at hotmail.com  Fri Feb  1 04:15:15 2013
From: yuf228 at hotmail.com (Rob)
Date: Fri, 1 Feb 2013 04:15:15 +0000 (UTC)
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
References: <200305311150.h4VBopn2019091@localhost.localdomain>
Message-ID: <loom.20130201T045704-740@post.gmane.org>

Cyril C.C. Chua <bmbcccc <at> bmb.leeds.ac.uk> writes:

> 
> Hi,
> 
> I have some difficulty in sourcing for BLASTCLUST or related 
> programs/mods. Does any1 know exactly how to locate them?
> 
> Regards
> 
> Cyril Chua
> 


Hi Cyril,

I heard of the following programmes that might do similar things (I HAVEN'T 
used any of them yet):

Afree - http://www.vicbioinformatics.com/software.afree.shtml
Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf
Usearch - http://www.drive5.com/usearch/
DomClust - http://mbgd.genome.ad.jp/domclust/

or 

Check this: 

http://ppod.princeton.edu/help/help_tech.html

God bless,


Robert


From whereverroadgoes at gmail.com  Mon Feb  4 15:39:19 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 07:39:19 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
Message-ID: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>

The result I get is:

Number of bases of type A = 
Number of bases of type C = 
Number of bases of type G = 
Number of bases of type T = 

i.e. There's no expected values. 
Please help!

#! /usr/bin/perl

use Bio::Tools::SeqStats;
use Bio::Seq;

open (FILE, "seq.fasta");
@array = <FILE>;

# Removing first line of fasta

shift (@array);
$array = join('', @array);
open (FILE2, ">>seq2.fasta");
print FILE2 "$array";

$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
- alphabet => 'dna',);


my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);

my $monomer_ref = $seq_stats->count_monomers();

foreach $base (sort keys %$monomer_ref) {
print "Number of bases of type ", $base," = ", $monomer_ref{$base},"\n";
}


From hamish.mcwilliam at bioinfo-user.org.uk  Mon Feb  4 16:59:16 2013
From: hamish.mcwilliam at bioinfo-user.org.uk (Hamish McWilliam)
Date: Mon, 4 Feb 2013 16:59:16 +0000
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
In-Reply-To: <loom.20130201T045704-740@post.gmane.org>
References: <200305311150.h4VBopn2019091@localhost.localdomain>
	<loom.20130201T045704-740@post.gmane.org>
Message-ID: <CABqDwwLHWp2fZm5h8KJmZhBFV6QmNLJrg5OE=hR+9U3Y3UJ7_g@mail.gmail.com>

BLASTCLUST is part of the legacy NCBI BLAST package (not NCBI BLAST+)
and can be obtained from:

ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST

As Robert notes there are many other tools which can be used to
perform sequence clustering, Wikipedia has a Sequence Clustering
article (http://en.wikipedia.org/wiki/Sequence_clustering) which lists
some of the most commonly used.

All the best,

Hamish

On 1 February 2013 04:15, Rob <yuf228 at hotmail.com> wrote:
> Cyril C.C. Chua <bmbcccc <at> bmb.leeds.ac.uk> writes:
>
>>
>> Hi,
>>
>> I have some difficulty in sourcing for BLASTCLUST or related
>> programs/mods. Does any1 know exactly how to locate them?
>>
>> Regards
>>
>> Cyril Chua
>>
>
>
> Hi Cyril,
>
> I heard of the following programmes that might do similar things (I HAVEN'T
> used any of them yet):
>
> Afree - http://www.vicbioinformatics.com/software.afree.shtml
> Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf
> Usearch - http://www.drive5.com/usearch/
> DomClust - http://mbgd.genome.ad.jp/domclust/
>
> or
>
> Check this:
>
> http://ppod.princeton.edu/help/help_tech.html
>
> God bless,
>
>
> Robert
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


--
----
"Saying the internet has changed dramatically over the last five years
is cliché - the internet is always changing dramatically" - Craig
Labovitz, Arbor Networks.


From whereverroadgoes at gmail.com  Mon Feb  4 17:34:10 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 09:34:10 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
Message-ID: <b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>

Thanks Roy,

It still doesn't seem to produce anything. :/


From roy.chaudhuri at gmail.com  Mon Feb  4 17:51:03 2013
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Mon, 4 Feb 2013 17:51:03 +0000
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
Message-ID: <CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>

Sorry, I'd missed another problem in your code - you are trying to
load a fasta file using Bio::PrimarySeq. To read sequence data from a
file you should use Bio::SeqIO, see:

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_file
http://www.bioperl.org/wiki/HOWTO:SeqIO

Cheers,
Roy.


From asjo at koldfront.dk  Mon Feb  4 17:58:25 2013
From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Mon, 04 Feb 2013 18:58:25 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> (Slym's
	message of "Mon, 4 Feb 2013 07:39:19 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID: <8738xc2c72.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 07:39:19 -0800 (PST), Slym wrote:

> #! /usr/bin/perl

> use Bio::Tools::SeqStats;
> use Bio::Seq;

It can be a good idea to add "use strict; use warnings;" to the top of
your script. At least two problems in your program would have been
caught by perl if you had.

> open (FILE, "seq.fasta");

Using (global) literal filehandles and the two parameter open() is
somewhat outdated, a more current way to do it could be:

  open my $fh, '<', 'seq.fasta';

> @array = <FILE>;

> # Removing first line of fasta

> shift (@array);
> $array = join('', @array);
> open (FILE2, ">>seq2.fasta");
> print FILE2 "$array";

Note that you are writing just the sequence to your seq2.fasta file
here, so the new file isn't really a fasta file.

> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
> - alphabet => 'dna',);

Bio::PrimarySeq doesn't take a '-file' parameter. Also, note that the
filename is different than before "sekw2" vs. "seq2"!

Either you should use Bio::SeqIO with a '-file' parameter, or you can
use Bio::PrimarySeq with a '-seq' parameter.
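
For completeness, the '-seq' route might look roughly like the sketch below. It keeps the "strip the header line by hand" approach from the original script but passes the sequence string itself to Bio::PrimarySeq. The @lines array stands in for the contents of seq.fasta and the '-id' value is arbitrary; both are assumptions made to keep the sketch self-contained.

```perl
use strict;
use warnings;
use Bio::PrimarySeq;
use Bio::Tools::SeqStats;

# Stand-in for the lines read from seq.fasta (one header line, one sequence line).
my @lines = ( ">test\n", "aaaacccggt\n" );

shift @lines;    # drop the FASTA header line
chomp @lines;    # remove newlines so they do not end up in the sequence
my $sequence = join '', @lines;

# Bio::PrimarySeq takes the sequence string via -seq, not a file name.
my $seqobj = Bio::PrimarySeq->new(
    -seq      => $sequence,
    -alphabet => 'dna',
    -id       => 'test',
);

my $counts = Bio::Tools::SeqStats->new( -seq => $seqobj )->count_monomers();
print "$_ = $counts->{$_}\n" for sort keys %$counts;
```

Note that this only handles a file with a single record. For anything else Bio::SeqIO is the safer choice.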

> my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);

> my $monomer_ref = $seq_stats->count_monomers();

> foreach $base (sort keys %$monomer_ref) {
> print "Number of bases of type ", $base," = ", $monomer_ref{$base},"\n";

Here you wanted $monomer_ref->{$base}, as %monomer_ref isn't mentioned
anywhere else.

> }

Here is a complete version of your script - I chose to use Bio::SeqIO -
that works:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use Bio::SeqIO;
  use Bio::Tools::SeqStats;

  my $io=Bio::SeqIO->new(-file=>'seq.fasta', -alphabet=>'dna');
  my $seqobj=$io->next_seq; # Get the first sequence from the file

  my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
  my $monomer_ref = $seq_stats->count_monomers();
  foreach my $base (sort keys %$monomer_ref) {
      print "Number of bases of type ", $base," = ", $monomer_ref->{$base},"\n";
  }

E.g.:

  $ cat seq.fasta
  >test
  aaaacccggt
  $ ./slym.pl 
  Number of bases of type A = 4
  Number of bases of type C = 3
  Number of bases of type G = 2
  Number of bases of type T = 1
  $ 


  Best regards,

    Adam

-- 
 "Grittings. Ma nam is Kahlfin."                              Adam Sjøgren
                                                         asjo at koldfront.dk


From whereverroadgoes at gmail.com  Mon Feb  4 18:02:29 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 10:02:29 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
Message-ID: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>

The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
error (saying that it wants input provided by Bio::PrimarySeq).
(btw in this line
 $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 
'dna',); 
there's a typo "sekw2" instead of "seq2" but this is correct in my original 
code).


From cjfields at illinois.edu  Mon Feb  4 18:54:39 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Mon, 4 Feb 2013 18:54:39 +0000
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE161ED@CHIMBX5.ad.uillinois.edu>

Please make sure to read both Roy's and Adam's responses all the way through; Bio::SeqIO is not a sequence object but the front end for format parsing (e.g. FASTA).  Bio::PrimarySeq does not have a '-file' parameter; Bio::SeqIO does.  

If SeqStats truly doesn't work with Bio::Seq we can fix that, but according to Adam he has tested it using Bio::SeqIO and it seems to work.

chris

On Feb 4, 2013, at 12:02 PM, Slym <whereverroadgoes at gmail.com>
 wrote:

> The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
> error (saying that it wants input provided by Bio::PrimarySeq).
> (btw in this line
> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 
> 'dna',); 
> there's a typo "sekw2" instead of "seq2" but this is correct in my original 
> code).
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From asjo at koldfront.dk  Mon Feb  4 20:00:32 2013
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Mon, 04 Feb 2013 21:00:32 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com> (Slym's
	message of "Mon, 4 Feb 2013 10:02:29 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
Message-ID: <87txpr26jj.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 10:02:29 -0800 (PST), Slym wrote:

> The thing is, if I use Bio::SeqIO then  Bio::Tools::SeqStats produces an 
> error (saying that it wants input provided by Bio::PrimarySeq).

That sounds like you forgot to call ->next_seq() on the Bio::SeqIO
object - to get a sequence object - please see the complete, working
example I sent earlier.


  Best regards,

    Adam

-- 
 "Denial springs eternal."                                    Adam Sjøgren
                                                         asjo at koldfront.dk


From scott at scottcain.net  Tue Feb  5 14:45:14 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 5 Feb 2013 09:45:14 -0500
Subject: [Bioperl-l] Have your say in the 2013 GMOD Community Survey!
Message-ID: <CA+JTaoy5NZubXo2jQ8oDN20BQ5BAHg3B9ZmYZRJ6f2Ryr+-awQ@mail.gmail.com>

Give us your thoughts on the GMOD project and win a personal DNA test
from 23andMe!

The GMOD project provides tools like GBrowse, Galaxy, MAKER, JBrowse,
Tripal, Apollo, Chado, and many more to a huge community of users and
developers around the world.

To make sure that GMOD is giving you the support you need, we want to
know how you use GMOD, which components you find valuable, your
opinion on support, training, and GMOD's strengths and weaknesses.
Your feedback is vital in helping GMOD to serve its user community
more effectively and to suggest future directions for the project.

Do the survey: http://gmod.org/survey.html

The survey should take between 10 and 15 minutes (including thinking
time), and participants can enter a draw to win "A Journey Through
Your DNA", the personal DNA test from 23andMe (the winner can pick a
$50 Amazon gift voucher if they prefer).

The survey will be open until March 1st. Results will be collated and
discussed at the April 2013 GMOD Meeting in Cambridge, UK, and posted
on the GMOD wiki at http://gmod.org.

Please spread the word to other friends and colleagues who use GMOD:
the more voices we hear, the better the picture we get of the needs of
our users, and the better we can help you!

Do the survey: http://gmod.org/survey.html

If you have any questions or problems with the survey, please email me
-- I will be happy to help out!

Thanks,
Scott


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From tiago.hori at gmail.com  Tue Feb  5 15:21:55 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:21:55 -0800 (PST)
Subject: [Bioperl-l] Search I::O
Message-ID: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>

Hi All,

I am trying to find the best putative orthologs for 44K Atlantic salmon 
sequences, so I need to parse 44K BLAST reports to find the best human 
hit. I am trying to learn Bio::SearchIO, but when I try the first example 
from the HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast'
               -file => 'C001R047.txt');

while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }  
  }
}

I get this error: Odd number of elements in hash assignment at 
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

I am using BioPerl version 1.6.901. Is there a format problem with the 
blast reports?

Any help would be greatly appreciated!

T.
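
One thing worth checking in the snippet above, offered as a likely cause rather than a confirmed diagnosis: there is no comma between -format => 'blast' and -file => '...'. Without it, Perl reads 'blast' -file as a subtraction, so the constructor receives three arguments instead of four, which would produce exactly this "Odd number of elements in hash assignment" message. The sketch below uses a hypothetical count_args helper in place of Bio::SearchIO->new so that it runs without BioPerl installed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a constructor that takes named arguments;
# it only reports how many arguments actually arrived.
sub count_args { return scalar @_ }

# Missing comma: 'blast' -file is parsed as ('blast' - 'file'), i.e. 0,
# so the list becomes ('-format', 0, 'test.txt').
my $broken = do {
    no warnings;    # silence the "isn't numeric" warnings for the demo
    count_args(-format => 'blast' -file => 'test.txt');
};

# With the comma, the list is two proper key/value pairs.
my $fixed = count_args(-format => 'blast', -file => 'test.txt');

print "without comma: $broken arguments\n";
print "with comma:    $fixed arguments\n";
```

If this is indeed the cause, adding the comma after 'blast' in the Bio::SearchIO->new call should make the error go away; the BLAST report format itself would not be involved.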


From tiago.hori at gmail.com  Tue Feb  5 15:33:32 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:33:32 -0800 (PST)
Subject: [Bioperl-l] Search::IO example from HOWTO
Message-ID: <c87907a1-18da-49ed-ad70-55ca7bd27658@googlegroups.com>

Hi All,

I am trying to run the example from the Bio::SearchIO HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast'
               -file => 'test.txt');

while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }  
  }
}

And I get this error: Odd number of elements in hash assignment at 
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

Can anybody help?

Cheers,

T.


From carandraug+dev at gmail.com  Tue Feb  5 18:56:21 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 18:56:21 +0000
Subject: [Bioperl-l] removing packages from bioperl-live
Message-ID: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>

Hi

some of the bioperl-live packages have already been split into
separate repositories. However, they were never actually removed from
bioperl-live. This creates 2 entry points for bug fixes and
implementations. After a chat on #bioperl, I was told to ask here.

Should these be removed? For example, there's bioperl-FeatureIO, but
that code also exists in bioperl-live. Can I remove it from
bioperl-live?

Carnë


From cjfields at illinois.edu  Tue Feb  5 19:34:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 19:34:07 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from
 bioperl-live
In-Reply-To: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>

Probably should retitle this to ask the question directly (make sure the right radars are pinged).

My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).

chris

On Feb 5, 2013, at 12:56 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> some of the bioperl-live packages have already been split into
> separate repositories. However, they were never actually removed from
> bioperl-live. This creates 2 entry points for bug fixes and
> implementations. After a chat on #bioperl, I was told to ask here.
> 
> Should these be removed? For example, there's bioperl-FeatureIO, but
> that code also exists in bioperl-live. Can I remove it from
> bioperl-live?
> 
> Carnë


From scott at scottcain.net  Tue Feb  5 19:36:10 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 5 Feb 2013 14:36:10 -0500
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID: <CA+JTaowxkgy+2ytqHG-MG6VrOdT7jGLQ9-_TJfVA3COsLgUZYw@mail.gmail.com>

I'm sure it will lead to lots of fun, but I suspect you are right and
it should be removed.  It's time you yank on that bandaid :-)

Scott


On Tue, Feb 5, 2013 at 2:34 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>
> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>
> chris
>
> On Feb 5, 2013, at 12:56 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> Hi
>>
>> some of the bioperl-live packages have already been split into
>> separate repositories. However, they were never actually removed from
>> bioperl-live. This creates 2 entry points for bug fixes and
>> implementations. After a chat on #bioperl, I was told to ask here.
>>
>> Should these be removed? For example, there's bioperl-FeatureIO, but
>> that code also exists in bioperl-live. Can I remove it from
>> bioperl-live?
>>
>> Carnë


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From carandraug+dev at gmail.com  Tue Feb  5 20:06:23 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 20:06:23 +0000
Subject: [Bioperl-l] dependencies on perl version
Message-ID: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>

Hi

how much perl backwards compatibility does bioperl need to keep?

If I have something I want to implement and use state (requires
5.010), is it acceptable? 5.010 is already quite an old perl version.
Of course, there are other less elegant ways to implement those
features. If I can't use modern perl stuff, what version number is the
limit?

Carnë


From carandraug+dev at gmail.com  Tue Feb  5 20:10:01 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 20:10:01 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>

On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>
> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).

Bio::FeatureIO was just an example; I meant the question more
generally. If the code is already in a separate repository, should
it be removed from bioperl-live?

Carnë


From cjfields at illinois.edu  Tue Feb  5 20:56:48 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:56:48 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>

Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.  

(for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
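
To make the trade-off concrete, here is a minimal sketch of what the 'use 5.010' suggestion looks like next to a 5.8-compatible alternative. The sub names next_id and next_id_compat are hypothetical and do not come from any BioPerl module; the closure version is only one of the "less elegant" ways to get the same behaviour.

```perl
use 5.010;      # requires perl 5.10+ and enables the "state" and "say" features
use strict;
use warnings;

# With perl >= 5.10: a persistent counter declared with "state".
sub next_id {
    state $counter = 0;
    return ++$counter;
}

# 5.8-compatible equivalent: a named sub closing over a lexical
# declared in an enclosing block.
{
    my $counter = 0;
    sub next_id_compat { return ++$counter }
}

my @ids        = map { next_id() } 1 .. 3;
my @ids_compat = map { next_id_compat() } 1 .. 3;
say "state:   @ids";
say "closure: @ids_compat";
```

Both counters yield the same sequence; the difference is only that the first form fails to compile on perl 5.8, which is why the version line at the top matters.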

chris

On Feb 5, 2013, at 2:06 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> how much perl backwards compatibility does bioperl need to keep?
> 
> If I have something I want to implement and use state (requires
> 5.010), is it acceptable? 5.010 is already quite an old perl version.
> Of course, there are other less elegant ways to implement those
> features. If I can't use modern perl stuff, what version number is the
> limit?
> 
> Carnë


From cjfields at illinois.edu  Tue Feb  5 20:59:38 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:59:38 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 2:10 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>> 
>> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
> 
> Mentioning Bio::FeatureIO was just an example. I meant to ask it as
> more general. If the code is already in a separate repository, should
> it be removed from bioperl-live?
> 
> Carnë

Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better).  Once we get a new release out we should remove the rest.

chris


From cjfields at illinois.edu  Tue Feb  5 21:53:29 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 21:53:29 +0000
Subject: [Bioperl-l] Next BioPerl release
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>

All,

I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  

Among other changes, as mentioned in a separate thread, we will remove Bio::FeatureIO, which is now developed in a separate repository:

    https://github.com/bioperl/Bio-FeatureIO

Feedback, suggestions, etc are greatly appreciated.

chris


From miker at htblis.com  Wed Feb  6 00:54:17 2013
From: miker at htblis.com (Michael Rogoff)
Date: Tue, 5 Feb 2013 16:54:17 -0800
Subject: [Bioperl-l] Bio::Graphics error when rendering features with Split
	locations
Message-ID: <C71FF11A-F2E2-4204-9A10-50F5535A0C81@htblis.com>

When trying to render features from a genbank file that include a split location e.g.:

     promoter        join(1000..1080,1..5)
                     /label=PROM1

The following exception is raised:
Can't locate object method "has_tag" via package "Bio::Location::Simple" at lib/perl5/site_perl/5.10.1/Bio/Graphics/Glyph.pm line 704, <GEN0> line 36.

This can be reproduced with the code in the example "Rendering Features from a GenBank or EMBL File" from the Graphics HOW-TO:
http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File

Is there a way to change the script so that split locations would, at the very least, not cause a fatal error?  Is there a different glyph type that needs to be used?  Thanks in advance for any help.

I've attached a simple genbank input that will reproduce the error:

LOCUS       sample2     1080 bp DNA    circular
DEFINITION  Cloning vector sample2
ACCESSION   sample2
VERSION     sample2.1  GI:4352432
COMMENT     Component Fragments
FEATURES               Location/Qualifiers
     terminator      39..328
                     /label=TERM1
                     /note="terminator 1"
     misc_feature    393..488
                     /label=MF1
     CDS             complement(800..900)
                     /label=CDS1
                     /note="resistence gene"
     promoter        join(1000..1080,1..5)
                     /label=PROM1
ORIGIN
        1  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
       61  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      121  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      181  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      241  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      301  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      361  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      421  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      481  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      541  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      601  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      661  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      721  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      781  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      841  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      901  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      961  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
     1021  nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
//


P.S.  I think I have traced the source of the problem to Glyph's _subfeat method, which in the case of a feature with split locations is returning location objects instead of feature objects.  Is this a bug?

sub _subfeat {
  my $class   = shift;
  my $feature = shift;

  return $feature->segments     if $feature->can('segments');

  my @split = eval { my $id   = $feature->location->seq_id;
                     my @subs = $feature->location->sub_Location;
                     grep {$id eq $_->seq_id} @subs;
                   };

  return @split if @split;

  # Either the APIs have changed, or I got confused at some point...
  return $feature->get_SeqFeatures         if $feature->can('get_SeqFeatures');
  return $feature->sub_SeqFeature          if $feature->can('sub_SeqFeature');
  return;
}
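
A small self-contained sketch of the situation described in the P.S., for anyone who wants to see the failure mode without Bio::Graphics installed. The packages Sketch::Location and Sketch::Feature and the helper only_features are hypothetical stand-ins, and the can('has_tag') guard is just one possible way to avoid the fatal error, not a fix taken from the Bio::Graphics code.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for a location object: it has no has_tag() method.
package Sketch::Location;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

# Hypothetical stand-in for a feature object: it does implement has_tag().
package Sketch::Feature;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub has_tag { my ($self, $tag) = @_; return exists $self->{tags}{$tag} }

package main;

# Keep only objects that implement the method the caller is going to use,
# instead of dying on the first object that lacks it.
sub only_features {
    return grep { blessed($_) && $_->can('has_tag') } @_;
}

# A mixed list like the one a split location could hand back.
my @mixed = (
    Sketch::Location->new(start => 1000, end => 1080),
    Sketch::Location->new(start => 1,    end => 5),
    Sketch::Feature->new(tags => { label => 'PROM1' }),
);

my @features = only_features(@mixed);
print scalar(@features), " of ", scalar(@mixed), " objects support has_tag\n";
```

A guard like this would only avoid the exception; the two halves of the split location would then not be drawn as separate parts, so wrapping each sub-location in a feature object may be the more useful direction.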


From l.m.timmermans at students.uu.nl  Wed Feb  6 02:40:27 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 03:40:27 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>

On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>
> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)

I *really* hate saying it, but I fear a lot of places are still stuck
on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
department still is and doesn't seem to be in a hurry to upgrade, and
I'm pretty sure it won't be the only one (though personally I use a
self-compiled 5.16).

Leon


From florent.angly at gmail.com  Wed Feb  6 02:51:27 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:51:27 +1000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages
 from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_1z3xYWVFvObLryf7E4w1oO3O0ZjJ_Cu8HA805=S0Fpzw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0qgrs3FKaoyFHL_RmbYJG8jNDfhxW-YddFVUfW3DFn4w@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
Message-ID: <5111C52F.50101@gmail.com>

On 06/02/13 06:59, Fields, Christopher J wrote:
> On Feb 5, 2013, at 2:10 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>
>> On 5 February 2013 19:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
>>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>>>
>>> My vote is yes, it should be removed.  There were a lot of implementation issues with it that ended up becoming problematic.  I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>> Mentioning Bio::FeatureIO was just an example. I meant to ask it as
>> more general. If the code is already in a separate repository, should
>> it be removed from bioperl-live?
>>
>> Carnë
> Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better).  Once we get a new release out we should remove the rest.
>
> chris

Sounds good to me (I've been burnt once by the fact that Bio::FeatureIO 
is in two places).
Florent


From florent.angly at gmail.com  Wed Feb  6 02:56:19 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:56:19 +1000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
Message-ID: <5111C653.2010703@gmail.com>

For what it's worth, the current stable version of Debian uses perl 
5.10.1 (http://packages.debian.org/stable/perl/perl).
Florent

On 06/02/13 12:40, Leon Timmermans wrote:
> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>
>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
> I *really* hate saying it, but I fear a lot of places are still stuck
> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
> department still is and doesn't seem to be in a hurry to upgrade, and
> I'm pretty sure it won't be the only one (though personally I use a
> self-compiled 5.16).
>
> Leon


From hlapp at drycafe.net  Wed Feb  6 03:27:35 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 5 Feb 2013 22:27:35 -0500
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <09524241-59F8-4BFF-8054-53CD0A649C11@drycafe.net>


On Feb 5, 2013, at 4:53 PM, Fields, Christopher J wrote:

> I am scheduling the next BioPerl CPAN release tentatively for March 1.

Yay!! Thanks for your leadership again, Chris, and for volunteering your time for the project. If nothing else, and I know this is no compensation really worth speaking of, we owe you beer, and I'll certainly pay my debt to you in Berlin if you come there.

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From hlapp at drycafe.net  Wed Feb  6 03:32:40 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 5 Feb 2013 22:32:40 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <5111C653.2010703@gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
Message-ID: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>

Does anyone know what Ubuntu uses? I've heard of lots of other old-version problems with CentOS.

8 years is really old, and at some point I fear that weighing backwards compatibility too heavily just holds us back in a really detrimental way.

	-hilmar

On Feb 5, 2013, at 9:56 PM, Florent Angly wrote:

> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl).
> Florent
> 
> On 06/02/13 12:40, Leon Timmermans wrote:
>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
>> <cjfields at illinois.edu> wrote:
>>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>> 
>>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
>> I *really* hate saying it, but I fear a lot of places are still stuck
>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
>> department still is and doesn't seem to be in a hurry to upgrade, and
>> I'm pretty sure it won't be the only one (though personally I use a
>> self-compiled 5.16).
>> 
>> Leon

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From cjfields at illinois.edu  Wed Feb  6 03:58:08 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 03:58:08 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18CBE@CHIMBX5.ad.uillinois.edu>

Re: being held back, I agree.  I don't necessarily want to intentionally break current modules by adding modern code unless it can be demonstrated to be a decent benefit performance-wise, but I don't want to impede new additions by requiring compat with perl 5.8 (hence my suggestion of a 'use 5.01x' pragma when appropriate).

Ubuntu 12.04 LTS is on perl 5.14.2: 

    http://askubuntu.com/questions/80672/what-perl-version-will-be-in-12-04-lts

BTW, I was wrong about perl 5.8 being 8 yrs old; it's almost 11 yrs old (perl 5.8.0 was released on 7/18/2002).  perl 5.8 reached end-of-life in 2008, fixes being only for security reasons.

So, I support dropping perl 5.8 support, but we should have a decent route of use for the folks stuck on old clusters.

chris

On Feb 5, 2013, at 9:32 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:

> Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS.
> 
> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.
> 
> 	-hilmar
> 
> On Feb 5, 2013, at 9:56 PM, Florent Angly wrote:
> 
>> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl).
>> Florent
>> 
>> On 06/02/13 12:40, Leon Timmermans wrote:
>>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
>>> <cjfields at illinois.edu> wrote:
>>>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>>>> 
>>>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
>>> I *really* hate saying it, but I fear a lot of places are still stuck
>>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
>>> department still is and doesn't seem to be in a hurry to upgrade, and
>>> I'm pretty sure it won't be the only one (though personally I use a
>>> self-compiled 5.16).
>>> 
>>> Leon


From l.m.timmermans at students.uu.nl  Wed Feb  6 04:11:52 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 05:11:52 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
Message-ID: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>

On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp <hlapp at drycafe.net> wrote:
> Does anyone know what Ubuntu uses?

5.14.2, distrowatch is your friend ;-)

> I've heard lots of other old version problems with CentOS.

I know people who still use CentOS 4 in production :-|

> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.

CentOS 5 is 6 years old (and will be supported another 4), but CentOS
6 is 'only' 19 months. perl missing a release in the 5.8-5.10
timeframe, combined with an unfortunate alignment of its release
schedule with Red Hat's, doesn't do us any favors here.

Leon


From cjfields at illinois.edu  Wed Feb  6 04:14:24 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 04:14:24 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18E52@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 8:40 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> Aim for 5.10.1, but be careful of smart-match.  If you do this, make sure to add a 'use 5.010' pragma at the top.
>> 
>> (for those who don't like this, please speak up.  perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)
> 
> I *really* hate saying it, but I fear a lot of places are still stuck
> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
> department still is and doesn't seem to be in a hurry to upgrade, and
> I'm pretty sure it won't be the only one (though personally I use a
> self-compiled 5.16).
> 
> Leon

We had the same problem for a while, but our sysadmins were willing to set up perl 5.12 (at that time) loadable as a module (we can of course set up a local perl as well).  We're now using a sysadmin-installed perl 5.16 with our current cluster.

chris


From cjfields at illinois.edu  Wed Feb  6 04:24:31 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 04:24:31 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 10:11 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp <hlapp at drycafe.net> wrote:
>> Does anyone know what Ubuntu uses?
> 
> 5.14.2, distrowatch is your friend ;-)
> 
>> I've heard lots of other old version problems with CentOS.
> 
> I know people who still use CentOS 4 in production :-|
> 
>> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way.
> 
> CentOS 5 is 6 years old (and will be supported another 4), but CentOS
> 6 is 'only' 19 months. perl missing a release in the 5.8-5.10
> timeframe combined with an unfortunate alignment of its release
> schedule with Red Hat's don't do us any favors here.
> 
> Leon

Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7).  

We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases.

chris


From l.m.timmermans at students.uu.nl  Wed Feb  6 04:33:57 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 05:33:57 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXAjt8m9Go9YkGFOUkxw92FUoLFbs0Q_fys-f_gyAwX8yw@mail.gmail.com>

On Wed, Feb 6, 2013 at 5:24 AM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7).
>
> We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases.

Sounds reasonable. These things shouldn't come as a surprise.

I suspect that the thing that will save us is that most of these
people install it once and then never upgrade.

Leon


From hartzell at alerce.com  Wed Feb  6 17:58:07 2013
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 6 Feb 2013 09:58:07 -0800
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
Message-ID: <20754.39343.128576.743448@gargle.gargle.HOWL>

Fields, Christopher J writes:
 > [...]
 > Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
 > out that Python users are in the same boat: the Python version for
 > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
 > (and recommends python 2.7).   
 > 
 > We can always state that perl 5.8 is supported for the upcoming
 > Bioperl release, but we're dropping v5.8 support for any future
 > releases. 

Do more than drop support for 5.8.

The Perl community has put a transparent and predictable process in
place for releasing [generally] better versions of the language.  It
means that Perl has a chance of continuing to be relevant, attracting
new talent and actually *fixing* some of the s&%t that gives Perl a
bad rap.  It gives people something to plan around, no one should be
surprised that v 5.X.Y is coming out in mid 20ZZ.

BioPerl should do the same thing, declare a release policy that trails
along with the Perl release schedule.  Keep it simple and no one can
argue with it.  Support Perl releases as long as the releases
themselves are supported.
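
A minimal sketch of how such a trailing policy could be computed, assuming "supported" means the two most recent stable perl series (the window described in perlpolicy). The function name and the default window are hypothetical, not an agreed BioPerl policy:

```perl
use strict;
use warnings;

# Assumption for illustration: "supported" means the $window most recent
# stable release series. Stable series have even minor numbers (5.14, 5.16).
sub minimum_supported_series {
    my ($latest_minor, $window) = @_;
    $window = 2 unless defined $window;
    die "stable perl series have even minor numbers\n" if $latest_minor % 2;
    # Consecutive stable series are two minor numbers apart (5.14 -> 5.16).
    return $latest_minor - 2 * ($window - 1);
}

# With 5.16 as the latest stable series, a two-series window reaches back to 5.14.
my $min = minimum_supported_series(16);
printf "minimum supported perl: 5.%d\n", $min;

# The matching version pragma for new code under that policy.
printf "pragma: use 5.%03d;\n", $min;
```

The point of the sketch is only that the floor is derived from the Perl release schedule rather than argued case by case.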

Rather than expending energy supporting out of date platforms, put the
energy into being modern (or Modern...), better distro building and
packaging, testing, documentation and releasing so that the process of
staying current is painless.

Look forward.  Keep it interesting and fun.

Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
make their living running sequencing gels in Plexiglas doohickeys on
their lab bench?

I'm not suggesting that the BioPerl community is free to make
arbitrary and capricious changes that make it difficult for *anyone*
to get anything done.  Churn is a waste of time.

But why should the all-volunteer BioPerl community be stuck supporting
code from 12 years ago because it's cost effective for someone else to
avoid spending *their* $/time/people to stay up to date?

Those sites that value stability/maturity/stagnation so highly have
already accepted the cost/difficulty of nailing one of their feet to
the floor as they try to run forward.  They recognize and depend on
the benefits of having that stable base but generally they've also
accepted the costs associated with their restrictive choices.  They
know how to pull in separate kernel/driver updates so that they can
actually run on nearly modern hardware.  They know, and live with, the
fact that they're not going to have access to the shiny new stuff.
And they know how to stay up to date, when they need to, with the
software that their users need to be competitive (e.g. BioConductor
and R).

As long as (if/when...) updating a BioPerl release is something that
can reliably happen with a few cpanm invocations then the sites that
otherwise favor punctuated equilibrium will learn to handle gradual
change.

Those folks that are "stuck" on older releases always have the option
of supporting professional Perl programmers to keep older releases
going, backport changes, etc....  They're already buying support for
their platforms (or freeloading and coping), let them put bread on the
table at one of the bioinformatics consultancies or labs if they have
something special they need.

Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
one is paying you to be backwards compatible with the previous
millennium.

g.


From amackey at virginia.edu  Wed Feb  6 18:47:46 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Wed, 6 Feb 2013 13:47:46 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
Message-ID: <CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>

Huzzah!

--
Aaron J. Mackey, PhD
Assistant Professor
Center for Public Health Genomics
University of Virginia
amackey at virginia.edu
http://www.cphg.virginia.edu/mackey


On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell <hartzell at alerce.com> wrote:

> Fields, Christopher J writes:
>  > [...]
>  > Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>  > out that Python users are in the same boat: the Python version for
>  > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>  > (and recommends python 2.7).
>  >
>  > We can always state that perl 5.8 is supported for the upcoming
>  > Bioperl release, but we're dropping v5.8 support for any future
>  > releases.
>
> Do more than drop support for 5.8.
>
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
>
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
>
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
>
> Look forward.  Keep it interesting and fun.
>
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
>
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that makes it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
>
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date.
>
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
>
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
>
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
>
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
>
> g.


From tiago.hori at gmail.com  Wed Feb  6 13:25:41 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Wed, 6 Feb 2013 05:25:41 -0800 (PST)
Subject: [Bioperl-l] Problems installing Bio::Tools::Run:StandAloneBlastPlus
Message-ID: <9b488c6e-34b3-4269-a7ac-e2206720939a@googlegroups.com>

Hi Guys,

I am trying to install the module Bio::Tools::Run::StandAloneBlastPlus, but 
it has been hard so far.

I managed to install and compile samtools, after finding all the 
dependencies, but I am still missing something! I posted the complete 
report below!

Any help would be great!

Cheers,

T.

cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus
Reading '/home/tiagohori/.cpan/Metadata'
  Database was generated on Tue, 05 Feb 2013 18:41:03 GMT
Running install for module 'Bio::Tools::Run::StandAloneBlastPlus'
Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz
Checksum for 
/home/tiagohori/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz 
ok
Scanning cache /home/tiagohori/.cpan/build for sizes
..................................------------------------------------------DONE
DEL(1/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz 
DEL(2/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz.yml 
DEL(3/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO 
DEL(4/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO.yml 
DEL(5/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC 
DEL(6/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC.yml 
DEL(7/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt 
DEL(8/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt.yml 
DEL(9/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4 
DEL(10/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4.yml 
DEL(11/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5 
DEL(12/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5.yml 
DEL(13/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn 
DEL(14/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn.yml 
DEL(15/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o 
DEL(16/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o.yml 
DEL(17/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U 
DEL(18/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U.yml 
DEL(19/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v 
DEL(20/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v.yml 

  CPAN.pm: Building C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz

Install scripts? y/n [n ]
n 
Do you want to run tests that require connection to servers across the 
internet
(likely to cause some failures)? y/n [n ]
n 
  - will not run internet-requiring tests
Created MYMETA.yml and MYMETA.json
Creating new 'Build' script for 'BioPerl-Run' version '1.006900'
Building BioPerl-Run
  CJFIELDS/BioPerl-Run-1.006900.tar.gz
  ./Build -- OK
Running Build test
t/Amap.t ...................... 1/18 # Required executable for 
Bio::Tools::Run::Alignment::Amap is not present
t/Amap.t ...................... ok     
t/AnalysisFactory_soap.t ...... skipped: Network tests have not been 
requested
t/Analysis_soap.t ............. skipped: Network tests have not been 
requested
t/BEDTools.t .................. 3/423 # Required executable for 
Bio::Tools::Run::BEDTools is not present
t/BEDTools.t .................. ok       
t/BWA.t ....................... 1/36 # Required executable for 
Bio::Tools::Run::BWA is not present
t/BWA.t ....................... ok     
t/Blat.t ...................... 1/33 # Required executable for 
Bio::Tools::Run::Alignment::Blat is not present
# Looks like you planned 33 tests but ran 20.
t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, 
0xff00)
Failed 13/33 subtests 
(less 15 skipped subtests: 5 okay)
t/Bowtie.t .................... 1/73 # Required executable for 
Bio::Tools::Run::Bowtie is not present
t/Bowtie.t .................... ok     
t/Cap3.t ...................... 1/91 # Required executable for 
Bio::Tools::Run::Cap3 is not present
t/Cap3.t ...................... ok     
t/Clustalw.t .................. 1/45 # Required executable for 
Bio::Tools::Run::Alignment::Clustalw is not present
t/Clustalw.t .................. ok     
t/Coil.t ...................... 2/6 # Required executable for 
Bio::Tools::Run::Coil is not present
t/Coil.t ...................... ok   
t/Consense.t .................. 1/9 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::Consense is not present
t/Consense.t .................. ok   
t/DBA.t ....................... 1/18 # Required executable for 
Bio::Tools::Run::Alignment::DBA is not present
t/DBA.t ....................... ok     
t/DrawGram.t .................. 1/6 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::DrawGram is not present
t/DrawGram.t .................. ok   
t/DrawTree.t .................. 1/6 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::DrawTree is not present
t/DrawTree.t .................. ok   
t/EMBOSS.t .................... ok     
t/Ensembl.t ................... skipped: Network tests have not been 
requested
t/Eponine.t ................... 1/7 # Looks like you planned 7 tests but 
ran 2.
t/Eponine.t ................... Dubious, test returned 255 (wstat 65280, 
0xff00)
Failed 5/7 subtests 
t/Exonerate.t ................. 1/89 # Required executable for 
Bio::Tools::Run::Alignment::Exonerate is not present
t/Exonerate.t ................. ok     
t/FootPrinter.t ............... 1/24 # Required executable for 
Bio::Tools::Run::FootPrinter is not present
t/FootPrinter.t ............... ok     
t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable 
$GENEMARK_MODELS is not set
t/Genemark.hmm.prokaryotic.t .. ok     
t/Genewise.t .................. 1/20 # Required executable for 
Bio::Tools::Run::Genewise is not present
t/Genewise.t .................. ok     
t/Genscan.t ................... 1/6 # Required environment variable 
$GENSCANDIR is not set
t/Genscan.t ................... ok   
t/Gerp.t ...................... 1/33 # Required executable for 
Bio::Tools::Run::Phylo::Gerp is not present
t/Gerp.t ...................... ok     
t/Glimmer2.t .................. 1/217 # Required executable for 
Bio::Tools::Run::Glimmer is not present
t/Glimmer2.t .................. ok       
t/Glimmer3.t .................. 1/111 # Required executable for 
Bio::Tools::Run::Glimmer is not present
t/Glimmer3.t .................. ok       
t/Gumby.t ..................... 1/124 # Required executable for 
Bio::Tools::Run::Phylo::Gumby is not present
t/Gumby.t ..................... ok       
t/Hmmer.t ..................... 1/27 # Required executable for 
Bio::Tools::Run::Hmmer is not present
t/Hmmer.t ..................... ok     
t/Hyphy.t ..................... 2/15 # Required executable for 
Bio::Tools::Run::Phylo::Hyphy::SLAC is not present
t/Hyphy.t ..................... ok     
t/Infernal.t .................. 1/43 # Required executable for 
Bio::Tools::Run::Infernal is not present
t/Infernal.t .................. ok     
t/Kalign.t .................... 1/8 # Required executable for 
Bio::Tools::Run::Alignment::Kalign is not present
t/Kalign.t .................... ok   
t/LVB.t ....................... 1/19 # Required executable for 
Bio::Tools::Run::Phylo::LVB is not present
t/LVB.t ....................... ok     
t/Lagan.t ..................... 1/12 # Required executable for 
Bio::Tools::Run::Alignment::Lagan is not present
t/Lagan.t ..................... ok     
t/MAFFT.t ..................... 1/17 # Required executable for 
Bio::Tools::Run::Alignment::MAFFT is not present
t/MAFFT.t ..................... ok     
t/MCS.t ....................... 1/24 # Required executable for 
Bio::Tools::Run::MCS is not present
t/MCS.t ....................... ok     
t/Maq.t ....................... 1/51 # Required executable for 
Bio::Tools::Run::Maq is not present
t/Maq.t ....................... ok     
t/Match.t ..................... 1/7 # Required executable for 
Bio::Tools::Run::Match is not present
t/Match.t ..................... ok   
t/Mdust.t ..................... 1/5 # Required executable for 
Bio::Tools::Run::Mdust is not present
t/Mdust.t ..................... ok   
t/Meme.t ...................... 1/25 # Required executable for 
Bio::Tools::Run::Meme is not present
t/Meme.t ...................... ok     
t/Minimo.t .................... 1/72 # Required executable for 
Bio::Tools::Run::Minimo is not present
t/Minimo.t .................... ok     
t/Molphy.t .................... 1/10 # Required executable for 
Bio::Tools::Run::Phylo::Molphy::ProtML is not present
t/Molphy.t .................... ok     
t/Muscle.t .................... 1/16 # Required executable for 
Bio::Tools::Run::Alignment::Muscle is not present
t/Muscle.t .................... ok     
t/Neighbor.t .................. 1/17 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::Neighbor is not present
t/Neighbor.t .................. ok     
t/Newbler.t ................... 1/98 # Required executable for 
Bio::Tools::Run::Newbler is not present
t/Newbler.t ................... ok     
t/Njtree.t .................... 1/6 # Required executable for 
Bio::Tools::Run::Phylo::Njtree::Best is not present
t/Njtree.t .................... ok   
t/PAML.t ...................... 1/28 # Required executable for 
Bio::Tools::Run::Phylo::PAML::Codeml is not present
t/PAML.t ...................... ok     
t/Pal2Nal.t ................... 1/9 # Required executable for 
Bio::Tools::Run::Alignment::Pal2Nal is not present
t/Pal2Nal.t ................... ok   
t/PhastCons.t ................. 1/181 # Required executable for 
Bio::Tools::Run::Phylo::Phast::PhastCons is not present
t/PhastCons.t ................. ok       
t/Phrap.t ..................... 1/127 # Required executable for 
Bio::Tools::Run::Phrap is not present
t/Phrap.t ..................... ok       
t/Phyml.t ..................... 1/47 # Required executable for 
Bio::Tools::Run::Phylo::Phyml is not present
t/Phyml.t ..................... ok     
t/Primate.t ................... 1/8 # Required executable for 
Bio::Tools::Run::Primate is not present
t/Primate.t ................... ok   
t/Primer3.t ................... 1/9 # Required executable for 
Bio::Tools::Run::Primer3 is not present
t/Primer3.t ................... ok   
t/Prints.t .................... 1/7 # Required executable for 
Bio::Tools::Run::Prints is not present
t/Prints.t .................... ok   
t/Probalign.t ................. 1/13 # Required executable for 
Bio::Tools::Run::Alignment::Probalign is not present
t/Probalign.t ................. ok     
t/Probcons.t .................. 1/11 # Required executable for 
Bio::Tools::Run::Alignment::Probcons is not present
t/Probcons.t .................. ok     
t/Profile.t ................... 1/7 # Required executable for 
Bio::Tools::Run::Profile is not present
t/Profile.t ................... ok   
t/Promoterwise.t .............. 1/9 # Required executable for 
Bio::Tools::Run::Promoterwise is not present
t/Promoterwise.t .............. ok   
t/ProtDist.t .................. 1/14 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::ProtDist is not present
t/ProtDist.t .................. ok     
t/ProtPars.t .................. 1/11 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::ProtPars is not present
t/ProtPars.t .................. ok     
t/Pseudowise.t ................ 1/18 # Required executable for 
Bio::Tools::Run::Pseudowise is not present
t/Pseudowise.t ................ ok     
t/QuickTree.t ................. 1/13 # Required executable for 
Bio::Tools::Run::Phylo::QuickTree is not present
t/QuickTree.t ................. ok     
t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as  or 
not executable. 
# Required executable for Bio::Tools::Run::RepeatMasker is not present
t/RepeatMasker.t .............. ok     
t/SABlastPlus.t ............... 1/65 # Required executable for 
Bio::Tools::Run::BlastPlus is not present
# Looks like you planned 65 tests but ran 63.
t/SABlastPlus.t ............... Dubious, test returned 255 (wstat 65280, 
0xff00)
Failed 2/65 subtests 
(less 59 skipped subtests: 4 okay)
t/SLR.t ....................... 1/7 # Required executable for 
Bio::Tools::Run::Phylo::SLR is not present
t/SLR.t ....................... ok   
t/Samtools.t .................. ok     
t/Seg.t ....................... 1/8 # Required executable for 
Bio::Tools::Run::Seg is not present
t/Seg.t ....................... ok   
t/Semphy.t .................... 1/19 # Required executable for 
Bio::Tools::Run::Phylo::Semphy is not present
t/Semphy.t .................... ok     
t/SeqBoot.t ................... 1/9 # Required executable for 
Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present
t/SeqBoot.t ................... ok   
t/Signalp.t ................... 1/7 # Required executable for 
Bio::Tools::Run::Signalp is not present
t/Signalp.t ................... ok   
t/Sim4.t ...................... 1/23 # Required executable for 
Bio::Tools::Run::Alignment::Sim4 is not present
t/Sim4.t ...................... ok     
t/Simprot.t ................... 1/6 # Required executable for 
Bio::Tools::Run::Simprot is not present
t/Simprot.t ................... ok   
t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap 
(or dependencies thereof) was not installed
t/SoapEU-unit.t ............... skipped: The optional module Bio::DB::ESoap 
(or dependencies thereof) was not installed
t/StandAloneFasta.t ........... 1/15 # Required executable for 
Bio::Tools::Run::Alignment::StandAloneFasta is not present
t/StandAloneFasta.t ........... ok     
t/TCoffee.t ................... 1/27 # Required executable for 
Bio::Tools::Run::Alignment::TCoffee is not present
t/TCoffee.t ................... ok     
t/TigrAssembler.t ............. 1/88 # Required executable for 
Bio::Tools::Run::TigrAssembler is not present
# Required executable for Bio::Tools::Run::TigrAssembler is not present
t/TigrAssembler.t ............. ok     
t/Tmhmm.t ..................... 1/9 # Required executable for 
Bio::Tools::Run::Tmhmm is not present
t/Tmhmm.t ..................... ok   
t/TribeMCL.t .................. ok     
t/Vista.t ..................... ok   
t/gmap-run.t .................. 1/8 # Required executable for 
Bio::Tools::Run::Alignment::Gmap is not present
t/gmap-run.t .................. ok   
t/tRNAscanSE.t ................ 1/12 # Required executable for 
Bio::Tools::Run::tRNAscanSE is not present
t/tRNAscanSE.t ................ ok     

Test Summary Report
-------------------
t/Blat.t                    (Wstat: 65280 Tests: 20 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 33 tests but ran 20.
t/Eponine.t                 (Wstat: 65280 Tests: 2 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 7 tests but ran 2.
t/SABlastPlus.t             (Wstat: 65280 Tests: 63 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 65 tests but ran 63.
Files=80, Tests=2876, 39 wallclock secs ( 0.54 usr  0.23 sys + 32.54 cusr 
 4.94 csys = 38.25 CPU)
Result: FAIL
Failed 3/80 test programs. 0/2876 subtests failed.
  CJFIELDS/BioPerl-Run-1.006900.tar.gz
  ./Build test -- NOT OK
//hint// to see the cpan-testers results for installing this module, try:
  reports CJFIELDS/BioPerl-Run-1.006900.tar.gz
Running Build install
  make test had returned bad status, won't install without force


From guy.leonard at gmail.com  Wed Feb  6 18:35:38 2013
From: guy.leonard at gmail.com (guy.leonard at gmail.com)
Date: Wed, 6 Feb 2013 10:35:38 -0800 (PST)
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>

Nice, super work. 

Will there be a rough list of feature changes/addition/deprecation, or 
shall I consult git logs?

On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote:
>
> All, 
>
> I am scheduling the next BioPerl CPAN release tentatively for March 1. 
>  Any help in triaging bug reports would be greatly appreciated!   
>
> Amongst all other changes, as mentioned in a separate thread we will 
> remove Bio::FeatureIO, now developed in a separate repository: 
>
>     https://github.com/bioperl/Bio-FeatureIO 
>
> Feedback, suggestions, etc are greatly appreciated. 
>
> chris 
> _______________________________________________ 
> Bioperl-l mailing list 
> Biop... at lists.open-bio.org 
> http://lists.open-bio.org/mailman/listinfo/bioperl-l 
>


From sidd.basu at gmail.com  Wed Feb  6 19:36:17 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 6 Feb 2013 13:36:17 -0600
Subject: [Bioperl-l]  Re: Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
Message-ID: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com>

Hi, 

On Tue, 05 Feb 2013, Fields, Christopher J wrote:

> All,
> 
> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
> 
> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
> 
>     https://github.com/bioperl/Bio-FeatureIO
> 
> Feedback, suggestions, etc are greatly appreciated.

Here are the CI build reports for Perl 5.12, 5.14 and 5.16 using Travis. 
https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true

I could not get 5.10 to work on Travis. Though I activated the --network
option, it still didn't run one of the tests that needs network access. Also, I was
initially confused by the fact that, though it has a dist.ini, the tests still have
to be run through Build.PL. Running **dzil test** does not work.

Hope this helps.

thanks, 
-siddhartha

> 
> chris


From cjfields at illinois.edu  Wed Feb  6 19:46:49 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 19:46:49 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A109@CHIMBX5.ad.uillinois.edu>

We've been a little better at keeping track of significant changes this time 'round.  There aren't a lot of major updates, but it's important to make sure we get a release out to ensure everyone (not just those familiar with git) can access them.

chris

On Feb 6, 2013, at 12:35 PM, <guy.leonard at gmail.com>
 wrote:

> Nice, super work. 
> 
> Will there be a rough list of feature changes/addition/deprecation, or 
> shall I consult git logs?
> 
> On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote:
>> 
>> All, 
>> 
>> I am scheduling the next BioPerl CPAN release tentatively for March 1. 
>> Any help in triaging bug reports would be greatly appreciated!   
>> 
>> Amongst all other changes, as mentioned in a separate thread we will 
>> remove Bio::FeatureIO, now developed in a separate repository: 
>> 
>>    https://github.com/bioperl/Bio-FeatureIO 
>> 
>> Feedback, suggestions, etc are greatly appreciated. 
>> 
>> chris 


From cjfields at illinois.edu  Wed Feb  6 19:54:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 19:54:58 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>

On Feb 6, 2013, at 1:36 PM, Siddhartha Basu <sidd.basu at gmail.com>
 wrote:

> Hi, 
> 
> On Tue, 05 Feb 2013, Fields, Christopher J wrote:
> 
>> All,
>> 
>> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
>> 
>> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
>> 
>>    https://github.com/bioperl/Bio-FeatureIO
>> 
>> Feedback, suggestions, etc are greatly appreciated.
> 
> Here are CI build report on 5.12, 5.14 and 5.16 using travis. 
> https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
> https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
> https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true
> 
> Could not get 5.10 to work on travis. Though i activated the (--network)
> option,  it still didn't run one of the test that needs network. Also, initially got
> confused by the fact that though it has dist.ini,  the tests still has
> to run through Build.PL. Running **dzil test** do not work.
> 
> Hope this helps.
> 
> thanks, 
> -siddhartha

Just to point out, that was for Bio-FeatureIO.  Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release).  

Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken).  I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed.  Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation.

chris


From sidd.basu at gmail.com  Wed Feb  6 20:26:06 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 6 Feb 2013 14:26:06 -0600
Subject: [Bioperl-l]  Re: Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
Message-ID: <5112bc60.c69e320a.1e98.2028@mx.google.com>

On Wed, 06 Feb 2013, Fields, Christopher J wrote:

> On Feb 6, 2013, at 1:36 PM, Siddhartha Basu <sidd.basu at gmail.com>
>  wrote:
> 
> > Hi, 
> > 
> > On Tue, 05 Feb 2013, Fields, Christopher J wrote:
> > 
> >> All,
> >> 
> >> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!  
> >> 
> >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:
> >> 
> >>    https://github.com/bioperl/Bio-FeatureIO
> >> 
> >> Feedback, suggestions, etc are greatly appreciated.
> > 
> > Here are CI build report on 5.12, 5.14 and 5.16 using travis. 
> > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true
> > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true
> > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true
> > 
> > Could not get 5.10 to work on travis. Though i activated the (--network)
> > option,  it still didn't run one of the test that needs network. Also, initially got
> > confused by the fact that though it has dist.ini,  the tests still has
> > to run through Build.PL. Running **dzil test** do not work.
> > 
> > Hope this helps.
> > 
> > thanks, 
> > -siddhartha
> 
> Just to point out, that was for Bio-FeatureIO.  Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release).  
So, what steps are left for getting the release out to CPAN? For example, are
there a lot of feature branches still left to be merged, or are there a lot
of unit tests still not passing? I'm just trying to figure out whether I
could be of any help in expediting the release process. However, if these
are already taken care of, please ignore this.

> 
> Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken).  I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed.  
As for the error I encountered, the presence of Build.PL was blocking the dzil
build/release process. By default, dzil expects to generate
Build.PL during its build/release process. However, I am not sure which
mode is the most suitable for BioPerl devs.
> Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation.

thanks, 
-siddhartha

> 
> chris


From hlapp at drycafe.net  Wed Feb  6 21:30:33 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 6 Feb 2013 16:30:33 -0500
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
Message-ID: <A78F0D43-8296-45CF-9409-320D1FE7CA2F@drycafe.net>

Great points, George, and you're making a very compelling argument. I'm in total agreement. It's almost reaching the point where one has to be embarrassed to still be programming in Perl these days, so one might as well have fun while it lasts.

	-hilmar

On Feb 6, 2013, at 12:58 PM, George Hartzell wrote:

> Fields, Christopher J writes:
>> [...]
>> Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>> out that Python users are in the same boat: the Python version for
>> CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>> (and recommends python 2.7).   
>> 
>> We can always state that perl 5.8 is supported for the upcoming
>> Bioperl release, but we're dropping v5.8 support for any future
>> releases. 
> 
> Do more than drop support for 5.8.
> 
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
> 
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
> 
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
> 
> Look forward.  Keep it interesting and fun.
> 
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
> 
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that makes it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
> 
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date.
> 
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
> 
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
> 
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
> 
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
> 
> g.

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From cjfields at illinois.edu  Wed Feb  6 22:11:06 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:11:06 +0000
Subject: [Bioperl-l] BioPerl long-term, was Re:  dependencies on perl version
In-Reply-To: <CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>

George,

Should put your post on a pedestal :)

tl;dr version: I completely agree, but we need help in order to do this.

Long(-winded) version:

I agree completely, backwards compatibility is killing us.  But, we do need current and new people to get involved and help drive this forward.  We need people on all fronts, from coding and bug fixes to documentation and web site maintenance.  I've been driving this bus for a number of years now.  Not getting tired yet, but I am getting substantially busier with my current endeavors, so my time spent working on BioPerl has dwindled considerably.  Any additional support or sharing of responsibilities will help tremendously in keeping up momentum (if someone else wants to take the wheel for a bit, please let me know :).  

If we follow the perl release route, we should streamline the release process (think Dist::Zilla), end support of older versions of Perl, and work on a sustainable release schedule.  The fact that we have so many of us so-called 'old folks' speaking up in favor of this is a very good sign.  We do need a bit more than that; we need help.  BioPerl is a very large project.

There is a key point we need to address, one that is very important for the future of BioPerl.  I use Perl quite a bit in my current work (dabble with Ruby and Python as well when I have to).  BioPerl?  A little, but not as much as I could.  

Shocked?  The three main reasons I don't use it 'in anger':  performance, performance, and performance.  It is very important that we make a concerted effort to address this at all levels.  It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them).  

A specific example: Heng Li once tested the performance of FASTQ parsing (perl, python, bioperl, biopython, his C code, etc). BioPerl's FASTQ couldn't even be measured; IIRC it went on for many hours until he killed it.  This was with the older version of the parser, but I'm willing to bet the newer one I wrote isn't any better.

This. needs. to. change.

I see no problem in stating any generic parsing and low-level interfaces are just as much a part of what BioPerl encompasses as the higher-level Bio::* classes themselves.  Steve and Jason were on to something with SearchIO; it's maybe not as performant as we would like, but it certainly is more flexible in terms of what can be done, b/c it separates out low-level parsing from object creation.  That's the general model we should look at.  There is a good reason Biopython is following this model with their SearchIO implementation (Peter C, are you reading this?)

We have a lot of very talented people involved with this project, both on the purely computational and purely biological end as well as the folks like me who straddle the two domains.  There is a lot of good code out there that can be used, wrapped, taken advantage of, including everything we currently have in BioPerl.  Let's come up with something that both works and works well, that people can use on a regular basis, even at a low level if they choose.  That alone would dissuade new users from writing up (yet another) custom FASTA/FASTQ/BLAST/GenBank/etc parser b/c the BioPerl one takes millennia to finish.  

A few examples on this front: Rob Buels created a generic parser for GFF3 (Bio::GFF3::LowLevel) with very few dependencies; we wrap this with the newer Bio::FeatureIO code.  Leon has Bio::SFF.  Lincoln of course wrote Bio::DB::Sam and Bio::DB::BigFile.  I have started a wrapper around Heng's FASTQ/FASTA parsing code (kseq); it seems to work quite well (~20M FASTQ in 30 sec last I recall?).  

So:

If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.

If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.

If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.

If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).

If it means this is to be BioPerl 2.0, then let's move in that direction, sooner rather than later.

But I can't do it alone.  We (not just me, but we) need to drive the direction we take.

First one who codes gets the gold ring.

chris

On Feb 6, 2013, at 12:47 PM, Aaron Mackey <amackey at virginia.edu>
 wrote:

> Huzzah!
> 
> --
> Aaron J. Mackey, PhD
> Assistant Professor
> Center for Public Health Genomics
> University of Virginia
> amackey at virginia.edu
> http://www.cphg.virginia.edu/mackey
> 
> 
> On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell <hartzell at alerce.com> wrote:
> Fields, Christopher J writes:
>  > [...]
>  > Right, it took ~8 yrs to go from 5.8 to 5.10.  I'd like to point
>  > out that Python users are in the same boat: the Python version for
>  > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5
>  > (and recommends python 2.7).
>  >
>  > We can always state that perl 5.8 is supported for the upcoming
>  > Bioperl release, but we're dropping v5.8 support for any future
>  > releases.
> 
> Do more than drop support for 5.8.
> 
> The Perl community has put a transparent and predictable process in
> place for releasing [generally] better versions of the language.  It
> means that Perl has a chance of continuing to be relevant, attracting
> new talent and actually *fixing* some of the s&%t that gives Perl a
> bad rap.  It gives people something to plan around, no one should be
> surprised that v 5.X.Y is coming out in mid 20ZZ.
> 
> BioPerl should do the same thing, declare a release policy that trails
> along with the Perl release schedule.  Keep it simple and no one can
> argue with it.  Support Perl releases as long as the releases
> themselves are supported.
> 
> Rather than expending energy supporting out of date platforms, put the
> energy into being modern (or Modern...), better distro building and
> packaging, testing, documentation and releasing so that the process of
> staying current is painless.
> 
> Look forward.  Keep it interesting and fun.
> 
> Everyone running Mac OS 9 on their Pismo, raise your hand.  Anyone
> make their living running sequencing gels in Plexiglas doohickeys on
> their lab bench?
> 
> I'm not suggesting that the BioPerl community is free to make
> arbitrary and capricious changes that makes it difficult for *anyone*
> to get anything done.  Churn is a waste of time.
> 
> But why should the all-volunteer BioPerl community be stuck supporting
> code from 12 years ago because it's cost effective for someone else to
> avoid spending *their* $/time/people to stay up to date.
> 
> Those sites that value stability/maturity/stagnation so highly have
> already accepted the cost/difficulty of nailing one of their feet to
> the floor as they try to run forward.  They recognize and depend on
> the benefits of having that stable base but generally they've also
> accepted the costs associated with their restrictive choices.  They
> know how to pull in separate kernel/driver updates so that they can
> actually run on nearly modern hardware.  They know, and live with, the
> fact that they're not going to have access to the shiny new stuff.
> And they know how to stay up to date, when they need to, with the
> software that their users need to be competitive (e.g. BioConductor
> and R).
> 
> As long as (if/when...) updating a BioPerl release is something that
> can reliably happen with a few cpanm invocations then the sites that
> otherwise favor punctuated equilibrium will learn to handle gradual
> change.
> 
> Those folks that are "stuck" on older releases always have the option
> of supporting professional Perl programmers to keep older releases
> going, backport changes, etc....  They're already buying support for
> their platforms (or freeloading and coping), let them put bread on the
> table at one of the bioinformatics consultancies or labs if they have
> something special they need.
> 
> Have fun.  Use sharp tools.  Do cool science.  Build cool things.  No
> one is paying you to be backwards compatible with the previous
> millennium.
> 
> g.


From cjfields at illinois.edu  Wed Feb  6 22:34:42 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:34:42 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re:  dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1AF0C@CHIMBX5.ad.uillinois.edu>

I want to clarify: parser optimization isn't the only point we need to focus on by any means (and may not be the main one).  There is a lot of room for improvement top to bottom; that was just one specific example I have long held to be an issue.

-c

On Feb 6, 2013, at 4:11 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:

> Shocked?  The main three reason I don't use it 'in anger':  performance, performance, and performance.  It is very important that we make a concerted effort to address this at all levels.  It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them).  
...


From p.j.a.cock at googlemail.com  Wed Feb  6 22:43:13 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 6 Feb 2013 22:43:13 +0000
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>

On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
>
> I see no problem in stating any generic parsing and low-level interfaces
> are just as much a part of what BioPerl encompasses as the higher-level
> Bio::* classes themselves.  Steve and Jason were on to something with
> SearchIO; it's maybe not as performant as we would like, but it certainly
> is more flexible in terms of what can be done, b/c it separates out
> low-level parsing from object creation.  That's the general model we
> should look at.  There is a good reason Biopython is following this
> model with their SearchIO implementation (Peter C, are you reading this?)

Actually I don't think we did end up with that kind of separation in the
Biopython SearchIO - which is not to say it isn't an excellent model
to follow. Rather the Biopython SearchIO (like the BioPerl one) had
as the first goal a consistent object model across assorted file
formats.

The idea of low-level, minimal-overhead parsers (which are very
format specific), on which a heavier but consistent object model
can be built, might be a good balance - the high-level API has the
convenience, but if you give that up you can have more speed.
That's what I recommend with FASTQ and Biopython, e.g.
http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
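
As a rough illustration of that two-layer idea, here is a minimal
sketch in plain Python. It is not the actual Biopython or BioPerl code;
the names iter_fastq, Read and iter_reads are made up for this example,
and it assumes the common four-lines-per-record FASTQ layout (no line
wrapping) with a Phred+33 quality encoding.

```python
import io
from dataclasses import dataclass
from typing import Iterator, List, TextIO, Tuple


def iter_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Low-level layer: yield (title, sequence, quality) as plain strings.

    Assumption: every record is exactly four lines (no wrapped sequences).
    """
    while True:
        title_line = handle.readline()
        if not title_line:
            return  # clean end of file
        seq = handle.readline().rstrip("\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\n")
        if not title_line.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near %r" % title_line)
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in %r" % title_line)
        yield title_line[1:].rstrip("\n"), seq, qual


@dataclass
class Read:
    """High-level layer: a hypothetical record object, built only on demand."""
    name: str
    seq: str
    phred: List[int]


def iter_reads(handle: TextIO, offset: int = 33) -> Iterator[Read]:
    """Wrap the tuple parser and pay the object-creation cost per record."""
    for title, seq, qual in iter_fastq(handle):
        name = title.partition(" ")[0]  # first word of the title line
        yield Read(name, seq, [ord(c) - offset for c in qual])


if __name__ == "__main__":
    example = "@r1 first read\nACGT\n+\nIIII\n"
    # Fast path: plain strings, no per-record objects.
    for title, seq, qual in iter_fastq(io.StringIO(example)):
        print(title, len(seq))
    # Convenient path: same parser underneath, richer records on top.
    for read in iter_reads(io.StringIO(example)):
        print(read.name, read.phred)
```

Callers who only need counts or simple filtering can stay on the tuple
layer, and the object layer costs nothing unless it is actually used.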

>
> I have started a wrapper around Heng's FASTQ/FASTA parsing
> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> last I recall?).
>

I'd have to dig through my emails, but I think the BioRuby guys
looked at that too - as I recall while it was fast, the error handling
left something to be desired. Email me directly or on the BioRuby
list if you want to follow up on that.

Regards,

Peter


From cjfields at illinois.edu  Wed Feb  6 22:53:21 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:53:21 +0000
Subject: [Bioperl-l] FASTQ, was Re:  BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>

On Feb 6, 2013, at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> 
>> I see no problem in stating any generic parsing and low-level interfaces
>> are just as much a part of what BioPerl encompasses as the higher-level
>> Bio::* classes themselves.  Steve and Jason were on to something with
>> SearchIO; it's maybe not as performant as we would like, but it certainly
>> is more flexible in terms of what can be done, b/c it separates out
>> low-level parsing from object creation.  That's the general model we
>> should look at.  There is a good reason Biopython is following this
>> model with their SearchIO implementation (Peter C, are you reading this?)
> 
> Actually I don't think we did end up with that kind of separation in the
> Biopython SearchIO - which is not so say it isn't an excellent model
> to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> as the first goal a consistent object model across assorted file
> formats.
> 
> The idea of low-level, minimal-overhead parsers (which are very
> format specific), on which a heavier but consistent object model
> can be built, might be a good balance - the high-level API has the
> convenience, but if you give that up you can have more speed.
> That's what I recommend with FASTQ and Biopython, e.g.
> http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> 
>> 
>> I have started a wrapper around Heng's FASTQ/FASTA parsing
>> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
>> last I recall?).
>> 
> 
> I'd have to dig through my emails, but I think the BioRuby guys
> looked at that too - as I recall while it was fast, the error handling
> left something to be desired. Email me directly or on the BioRuby
> list if you want to follow up on that.
> 
> Regards,
> 
> Peter

I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out.  IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into.  Maybe worth moving to open-bio-l for broader discussion.

chris


From whereverroadgoes at gmail.com  Wed Feb  6 21:59:04 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Wed, 6 Feb 2013 13:59:04 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <87txpr26jj.fsf@topper.koldfront.dk>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
	<CAJ57qHHphLgEyfkEEyt2HVh+RahSWpiuhuaA08vi5ZxMwDDgTg@mail.gmail.com>
	<b2154001-d1eb-4266-a491-108d3e6ae77d@googlegroups.com>
	<CAJ57qHG9zFomG1wB4fN7hZZaByvP_EhxOHRTt2OrOZz__WgawQ@mail.gmail.com>
	<d5e347d1-cbaa-498a-9b64-a5242fdc4dd8@googlegroups.com>
	<87txpr26jj.fsf@topper.koldfront.dk>
Message-ID: <411e920d-e614-417d-9198-78bef9adba16@googlegroups.com>

Everything's working now! Thank you very much, especially to you Adam!




From carandraug+dev at gmail.com  Thu Feb  7 01:38:20 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Thu, 7 Feb 2013 01:38:20 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAPOrs_0esYVUe_0gZHdAtk4orJQMO82fLjnfNL3Nap=BqX7RWw@mail.gmail.com>

On 5 February 2013 20:56, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 5, 2013, at 2:06 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>> how much perl backwards compatibility does bioperl need to keep?
>
> Aim for 5.10.1, but be careful of smart-match.

Well, I solved my problem differently and ended up not needing any of
the new features. But next time I'll know. Thanks

Carnë


From pcantalupo at gmail.com  Thu Feb  7 04:04:08 2013
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Wed, 6 Feb 2013 23:04:08 -0500
Subject: [Bioperl-l] bug 3376 status needs updated
Message-ID: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>

Hi,

A few months ago, I fixed bug 3376 (
https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
updated to resolved or closed. Should I do this or is Chris the only one
who does that?

Thank you,

Paul


From cjfields at illinois.edu  Thu Feb  7 04:20:30 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 04:20:30 +0000
Subject: [Bioperl-l] bug 3376 status needs updated
In-Reply-To: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>
References: <CAJqbkv77bC3eWGsaOwwXFnGMrAZjVJSSU97CCRwJmMMPLQRjTQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B45C@CHIMBX5.ad.uillinois.edu>

No, go ahead and close it.  Let me know if you run into perm. problems with it.

chris

On Feb 6, 2013, at 10:04 PM, Paul Cantalupo <pcantalupo at gmail.com>
 wrote:

> Hi,
> 
> A few months ago, I fixed bug 3376 (
> https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
> The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
> updated to resolved or closed. Should I do this or is Chris the only one
> who does that?
> 
> Thank you,
> 
> Paul
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From l.m.timmermans at students.uu.nl  Thu Feb  7 09:07:57 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Thu, 7 Feb 2013 10:07:57 +0100
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <5112bc60.c69e320a.1e98.2028@mx.google.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>
	<5112b0b3.a5dc320a.4105.1fe3@mx.google.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu>
	<5112bc60.c69e320a.1e98.2028@mx.google.com>
Message-ID: <CAC1jpXDQG8NwaPKd8PEVqWs7NWHHAkrGaasCeJ+bKVy1z0he1Q@mail.gmail.com>

On Wed, Feb 6, 2013 at 9:26 PM, Siddhartha Basu <sidd.basu at gmail.com> wrote:
> As for the error I encountered, the presence of Build.PL was blocking the
> dzil build/release process. By default, dzil expects to generate
> Build.PL during its build/release process. However, I am not sure which
> mode is the most suitable for BioPerl devs.

You can prune the Build.PL, and then let dzil add its own. We wouldn't
be the first to do that sort of thing.

Leon


From amackey at virginia.edu  Thu Feb  7 15:25:07 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 7 Feb 2013 10:25:07 -0500
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>

You might also want to consider a lazy/pull-based parser to defer
parsing/object-building for pieces of the object that don't get used.  This
also usually provides some error tolerance.

-Aaron

--
Aaron J. Mackey, PhD
Assistant Professor
Center for Public Health Genomics
University of Virginia
amackey at virginia.edu
http://www.cphg.virginia.edu/mackey


On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J <cjfields at illinois.edu
> wrote:

> On Feb 6, 2013, at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> > <cjfields at illinois.edu> wrote:
> >>
> >> I see no problem in stating any generic parsing and low-level interfaces
> >> are just as much a part of what BioPerl encompasses as the higher-level
> >> Bio::* classes themselves.  Steve and Jason were on to something with
> >> SearchIO; it's maybe not as performant as we would like, but it
> certainly
> >> is more flexible in terms of what can be done, b/c it separates out
> >> low-level parsing from object creation.  That's the general model we
> >> should look at.  There is a good reason Biopython is following this
> >> model with their SearchIO implementation (Peter C, are you reading
> this?)
> >
> > Actually I don't think we did end up with that kind of separation in the
> > Biopython SearchIO - which is not to say it isn't an excellent model
> > to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> > as the first goal a consistent object model across assorted file
> > formats.
> >
> > The idea of low-level, minimal-overhead parsers (which are very
> > format specific), on which a heavier but consistent object model
> > can be built, might be a good balance - the high-level API has the
> > convenience, but if you give that up you can have more speed.
> > That's what I recommend with FASTQ and Biopython, e.g.
> > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> >
> >>
> >> I have started a wrapper around Heng's FASTQ/FASTA parsing
> >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> >> last I recall?).
> >>
> >
> > I'd have to dig through my emails, but I think the BioRuby guys
> > looked at that too - as I recall while it was fast, the error handling
> > left something to be desired. Email me directly or on the BioRuby
> > list if you want to follow up on that.
> >
> > Regards,
> >
> > Peter
>
> I did a little on this, worth following up on, but I pulled the FASTQ test
> examples you created from the paper to test it out.  IIRC it parsed where
> it needed to, but I'm not sure how it handled bad sequences, so yes, worth
> looking into.  Maybe worth moving to open-bio-l for broader discussion.
>
> chris


From tiago.hori at gmail.com  Thu Feb  7 14:58:37 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Thu, 7 Feb 2013 06:58:37 -0800 (PST)
Subject: [Bioperl-l] Search I::O
In-Reply-To: <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
References: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>
	<6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
Message-ID: <e5d61704-086a-4434-ae80-434252d1f55e@googlegroups.com>

Thanks, Jason! It is working now.

So here is what I am trying to accomplish. For a given BLASTX report, I 
want to extract the best BLASTX hit that is human and whose description 
does not contain "unnamed" or "PREDICTED". I got very close, but I still 
can't get it to give me only the top BLAST hit; it gives me all BLAST hits 
that meet my criteria. I tried using "last" to stop it from looping through 
the hits once it found a human one, but it didn't work. Can someone help? 
Here is my code so far (mostly stolen from the wiki).

use strict;
use warnings;
use Bio::SearchIO;

my $in = Bio::SearchIO->new(-format => 'blast',
                            -file   => 'testsalmon.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    if( $hit->description !~ /[Uu]nnamed|PREDICTED|hypothetical/ ) {
      if( $hit->description =~ /Homo sapiens/ ) {
        while( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
            if( $hsp->percent_identity >= 30 ) {
              if( $hsp->evalue <= 1e-05 ) {
                ## this prints every HSP that passes the filters
                print "Query=",        $result->query_name,    "\t",
                      " Description=", $hit->description,      "\t",
                      " Hit=",         $hit->name,             "\t",
                      " Length=",      $hsp->length('total'),  "\t",
                      " Percent_id=",  $hsp->percent_identity, "\n";
              }
            }
          }
        }
      }
    }
  }
}
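
One possible way to stop at the first qualifying hit is sketched here. It rests on two assumptions that the message does not confirm: that "last" was placed in the innermost (HSP) loop, where it only leaves that loop, and that the report lists hits best-first, so the first hit passing the filters is the top one. first_human_hit is a made-up helper name, not a BioPerl function. Returning from a sub leaves both the HSP loop and the hit loop at once:

```perl
use strict;
use warnings;

# Hypothetical helper: return the first (hit, hsp) pair of a result that
# passes all filters, or an empty list if nothing qualifies.  It only uses
# the next_hit/next_hsp iterator methods already shown in the code above.
sub first_human_hit {
    my ($result) = @_;
    while ( my $hit = $result->next_hit ) {
        my $desc = $hit->description;
        next if $desc =~ /[Uu]nnamed|PREDICTED|hypothetical/;
        next unless $desc =~ /Homo sapiens/;
        while ( my $hsp = $hit->next_hsp ) {
            next unless $hsp->length('total') > 50;
            next unless $hsp->percent_identity >= 30;
            next unless $hsp->evalue <= 1e-05;
            return ( $hit, $hsp );    # leaves both loops in one step
        }
    }
    return;
}

# Runs only when a BLAST report is named on the command line.
if ( my $file = shift @ARGV ) {
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new( -format => 'blast', -file => $file );
    while ( my $result = $in->next_result ) {
        # An empty list means no qualifying hit; move on to the next query.
        my ( $hit, $hsp ) = first_human_hit($result) or next;
        print join( "\t",
            "Query=" . $result->query_name,
            "Description=" . $hit->description,
            "Hit=" . $hit->name,
            "Length=" . $hsp->length('total'),
            "Percent_id=" . $hsp->percent_identity ), "\n";
    }
}
```

The same effect can be had without a sub by labelling the outer loop (for example RESULT:) and calling "next RESULT" right after the print.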


T.


On Wednesday, February 6, 2013 6:46:47 PM UTC-3:30, Jason Stajich wrote:
>
> you are missing a comma after the -format => 'blast' 
> should be 
> my $in = Bio::SearchIO->new(-format => 'blast',   
>   -file => 'XXX' ); 
>
>
> On Feb 5, 2013, at 7:21 AM, Tiago Hori <tiago... at gmail.com> wrote: 
>
> > Hi All, 
> > 
> > I am trying to find the best putative orthologs for 44K Atlantic Salmon 
> > sequences, and so I need to parse 44K BLAST reports to find the best 
> human 
> > hit. I am trying to learn SearchIO, but when I try the first example on 
> > the HOWTO: use strict; 
> > use Bio::SearchIO; 
> > 
> > my $in = new Bio::SearchIO(-format => 'blast' 
> >               -file => 'C001R047.txt'); 
> > 
> > while( my $result = $in->next_result ) { 
> >  ## $result is a Bio::Search::Result::ResultI compliant object 
> >  while( my $hit = $result->next_hit ) { 
> >    ## $hit is a Bio::Search::Hit::HitI compliant object 
> >    while( my $hsp = $hit->next_hsp ) { 
> >      ## $hsp is a Bio::Search::HSP::HSPI compliant object 
> >      if( $hsp->length('total') > 50 ) { 
> >        if ( $hsp->percent_identity >= 75 ) { 
> >          print "Query=",   $result->query_name, 
> >            " Hit=",        $hit->name, 
> >            " Length=",     $hsp->length('total'), 
> >            " Percent_id=", $hsp->percent_identity, "\n"; 
> >        } 
> >      } 
> >    }   
> >  } 
> > } 
> > 
> > I get this error: Odd number of elements in hash assignment at 
> > /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. 
> > 
> > I am using BioPerl version 1.6.901. Is there a format problem with the 
> > blast reports? 
> > 
> > Any help would be greatly appreciated! 
> > 
> > T. 
>
> Jason Stajich 
> jason.... at gmail.com 
> ja... at bioperl.org 
>
>


From cjfields at illinois.edu  Thu Feb  7 15:56:04 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 15:56:04 +0000
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>

This will likely be the approach for a more NGS-friendly Bio::Seq class.  Calculation of the PHRED scores could also be deferred until needed.
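
As a rough illustration of deferring the PHRED calculation (a sketch only; LazyQual is a made-up class name, not an existing Bio::Seq interface, and the default ASCII offset of 33 assumes Sanger-style encoding), the raw quality string can be stored as-is and decoded on first request:

```perl
use strict;
use warnings;

package LazyQual;

# Keep the raw quality string; nothing is decoded at construction time.
sub new {
    my ( $class, $qual_string, $offset ) = @_;
    $offset = 33 unless defined $offset;    # assumed Sanger/Phred+33 encoding
    return bless { raw => $qual_string, offset => $offset }, $class;
}

sub raw { $_[0]{raw} }

# Decode to numeric PHRED scores on first call, then reuse the cached array.
sub scores {
    my ($self) = @_;
    unless ( $self->{scores} ) {
        $self->{scores} =
          [ map { ord($_) - $self->{offset} } split //, $self->{raw} ];
    }
    return $self->{scores};
}

package main;

my $q = LazyQual->new('I!5');
print join( ',', @{ $q->scores } ), "\n";    # prints 40,0,20
```

Code that never asks for scores pays only for storing the string.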

seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it.

chris

On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:

> You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used.  This also usually provides some error tolerance.
> 
> -Aaron
> 
> --
> Aaron J. Mackey, PhD
> Assistant Professor
> Center for Public Health Genomics
> University of Virginia
> amackey at virginia.edu
> http://www.cphg.virginia.edu/mackey
> 
> 
> On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Feb 6, 2013, at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> 
> > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> > <cjfields at illinois.edu> wrote:
> >>
> >> I see no problem in stating any generic parsing and low-level interfaces
> >> are just as much a part of what BioPerl encompasses as the higher-level
> >> Bio::* classes themselves.  Steve and Jason were on to something with
> >> SearchIO; it's maybe not as performant as we would like, but it certainly
> >> is more flexible in terms of what can be done, b/c it separates out
> >> low-level parsing from object creation.  That's the general model we
> >> should look at.  There is a good reason Biopython is following this
> >> model with their SearchIO implementation (Peter C, are you reading this?)
> >
> > Actually I don't think we did end up with that kind of separation in the
> > Biopython SearchIO - which is not to say it isn't an excellent model
> > to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> > as the first goal a consistent object model across assorted file
> > formats.
> >
> > The idea of low-level, minimal-overhead parsers (which are very
> > format specific), on which a heavier but consistent object model
> > can be built, might be a good balance - the high-level API has the
> > convenience, but if you give that up you can have more speed.
> > That's what I recommend with FASTQ and Biopython, e.g.
> > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> >
> >>
> >> I have started a wrapper around Heng's FASTQ/FASTA parsing
> >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> >> last I recall?).
> >>
> >
> > I'd have to dig through my emails, but I think the BioRuby guys
> > looked at that too - as I recall while it was fast, the error handling
> > left something to be desired. Email me directly or on the BioRuby
> > list if you want to follow up on that.
> >
> > Regards,
> >
> > Peter
> 
> I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out.  IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into.  Maybe worth moving to open-bio-l for broader discussion.
> 
> chris


From amackey at virginia.edu  Thu Feb  7 16:09:14 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 7 Feb 2013 11:09:14 -0500
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>

e.g., a pull-based FASTQ parser that did nothing else at the top level but
"chunk" the file into as-yet-unparsed four-line blobs could appear to work
very fast, if the user code did nothing but count the number of entries:

  while (my $seq = $seqio->next_seq) { $ct++ };

in other words, you defer *everything* except the minimal amount of
parsing/logic required to detect object boundaries.
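
A minimal sketch of that idea, assuming strict four-line FASTQ records (no wrapped sequence lines). The LazyFastqReader and LazyFastqRecord names are invented for illustration and are not part of BioPerl. The reader only gathers four raw lines per record; splitting and validation happen on first access to a field:

```perl
use strict;
use warnings;

package LazyFastqRecord;

sub new { my ( $class, $raw ) = @_; return bless { raw => $raw }, $class }

# Split and validate the four-line blob on first access, then cache it.
sub _fields {
    my ($self) = @_;
    return $self->{fields} if $self->{fields};
    my ( $head, $seq, $plus, $qual ) = split /\n/, $self->{raw};
    die "malformed FASTQ record\n"
      unless defined $qual
      && $head =~ /^\@(\S+)/
      && $plus =~ /^\+/
      && length($seq) == length($qual);
    my ($id) = $head =~ /^\@(\S+)/;
    return $self->{fields} = { id => $id, seq => $seq, qual => $qual };
}

sub id   { $_[0]->_fields->{id} }
sub seq  { $_[0]->_fields->{seq} }
sub qual { $_[0]->_fields->{qual} }

package LazyFastqReader;

sub new { my ( $class, $fh ) = @_; return bless { fh => $fh }, $class }

# Read the next four lines without inspecting them.
sub next_record {
    my ($self) = @_;
    my $fh  = $self->{fh};
    my $raw = '';
    for ( 1 .. 4 ) {
        my $line = <$fh>;
        return unless defined $line;    # end of file (a truncated tail is dropped)
        $raw .= $line;
    }
    return LazyFastqRecord->new($raw);
}

package main;

my $fastq = "\@r1\nACGT\n+\nIIII\n\@r2\nGG\n+\n!!\n";
open my $fh, '<', \$fastq or die "cannot open in-memory file: $!";
my $reader = LazyFastqReader->new($fh);

my ( $ct, $first ) = (0);
while ( my $rec = $reader->next_record ) { $first //= $rec; $ct++ }
print "$ct records\n";     # counting never parses a record
print $first->id, "\n";    # first access triggers parsing of this record only
```

A side effect of deferring validation is that a malformed record raises an error only when one of its fields is read, which is one way to get the error tolerance mentioned earlier in the thread.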

This is, in fact, the exact opposite of the event-based SearchIO "push"
parsers, which always perform the most parsing possible, despite the user
never accessing most of the material.

Lastly, with respect to performance, if the parsing/object building
operation is not simply IO bound, then parallel parser/object-building CPU
threads could be considered, which could then dynamically adapt to
pre-parse attributes (e.g. quality scores) that the calling code was
actually using.  What's the state of thread-safe Perl these days?

-Aaron


On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <
cjfields at illinois.edu> wrote:

> This will likely be the approach for more NGS-friendly Bio::Seq class.
>  Calculation of the PHRED scores could also be deferred until needed.
>
> seqtk has some C-based methods that we could possibly take advantage of,
> but will have to look into it.
>
> chris
>
> On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:
>
> > You might also want to consider a lazy/pull-based parser to defer
> parsing/object-building for pieces of the object that don't get used.  This
> also usually provides some error tolerance.
> >
> > -Aaron
>


From sidd.basu at gmail.com  Thu Feb  7 16:38:47 2013
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Thu, 7 Feb 2013 10:38:47 -0600
Subject: [Bioperl-l]  Re: FASTQ, was Re:BioPerl long-term,
	was Re:	dependencies on perl version
In-Reply-To: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
References: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
Message-ID: <5113d899.ea64320a.489a.262d@mx.google.com>

Another approach might be to use MapReduce (Hadoop), if possible. I have
seen one implementation in Biopython's GFF3 parser.
http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/

-siddhartha


On Thu, 07 Feb 2013, Aaron Mackey wrote:

> e.g., a pull-based FASTQ parser that did nothing else at the top level but
> "chunk" the file into as-yet-unparsed four-line blobs could appear to work
> very fast, if the user code did nothing but count the number of entries:
> 
>   while (my $seq = $seqio->nextseq) { $ct++ };
> 
> in other words, you defer *everything* except the minimal amount of
> parsing/logic required to detect object boundaries.
> 
> This is, in fact, the exact opposite of the event-based SearchIO "push"
> parsers, which always perform the most parsing possible, despite the user
> never accessing most of the material.
> 
> Lastly, with respect to performance, if the parsing/object building
> operation is not simply IO bound, then parallel parser/object-building CPU
> threads could be considered, which could then dynamically adapt to
> pre-parse attributes (e.g. quality scores) that the calling code was
> actually using.  What's the state of thread-safe Perl these days?
> 
> -Aaron
> 
> 
> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <
> cjfields at illinois.edu> wrote:
> 
> > This will likely be the approach for more NGS-friendly Bio::Seq class.
> >  Calculation of the PHRED scores could also be deferred until needed.
> >
> > seqtk has some C-based methods that we could possibly take advantage of,
> > but will have to look into it.
> >
> > chris
> >
> > On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:
> >
> > > You might also want to consider a lazy/pull-based parser to defer
> > parsing/object-building for pieces of the object that don't get used.  This
> > also usually provides some error tolerance.
> > >
> > > -Aaron
> >


From cjfields at illinois.edu  Thu Feb  7 16:55:53 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 16:55:53 +0000
Subject: [Bioperl-l] FASTQ, was Re:BioPerl long-term,
	was Re:	dependencies on perl version
In-Reply-To: <5113d899.ea64320a.489a.262d@mx.google.com>
References: <CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
	<5113d899.ea64320a.489a.262d@mx.google.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7B8@CHIMBX5.ad.uillinois.edu>

I think we will want to allow for a multitude of implementations.  SeqIO already allows for that to a degree, but multiple backend implementations (say, different ways of parsing/processing FASTQ and others) aren't supported yet.

chris

On Feb 7, 2013, at 10:38 AM, Siddhartha Basu <sidd.basu at gmail.com> wrote:

> Another approach might be use map-reduce(Hadoop) if possible. I have
> seen one implementation in biopython's GFF3 parser.
> http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/
> 
> -siddhartha
> 
> 
> On Thu, 07 Feb 2013, Aaron Mackey wrote:
> 
>> e.g., a pull-based FASTQ parser that did nothing else at the top level but
>> "chunk" the file into as-yet-unparsed four-line blobs could appear to work
>> very fast, if the user code did nothing but count the number of entries:
>> 
>>  while (my $seq = $seqio->nextseq) { $ct++ };
>> 
>> in other words, you defer *everything* except the minimal amount of
>> parsing/logic required to detect object boundaries.
>> 
>> This is, in fact, the exact opposite of the event-based SearchIO "push"
>> parsers, which always perform the most parsing possible, despite the user
>> never accessing most of the material.
>> 
>> Lastly, with respect to performance, if the parsing/object building
>> operation is not simply IO bound, then parallel parser/object-building CPU
>> threads could be considered, which could then dynamically adapt to
>> pre-parse attributes (e.g. quality scores) that the calling code was
>> actually using.  What's the state of thread-safe Perl these days?
>> 
>> -Aaron
>> 
>> 
>> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <
>> cjfields at illinois.edu> wrote:
>> 
>>> This will likely be the approach for more NGS-friendly Bio::Seq class.
>>> Calculation of the PHRED scores could also be deferred until needed.
>>> 
>>> seqtk has some C-based methods that we could possibly take advantage of,
>>> but will have to look into it.
>>> 
>>> chris
>>> 
>>> On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:
>>> 
>>>> You might also want to consider a lazy/pull-based parser to defer
>>> parsing/object-building for pieces of the object that don't get used.  This
>>> also usually provides some error tolerance.
>>>> 
>>>> -Aaron
>>> 


From cjfields at illinois.edu  Thu Feb  7 17:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 17:01:07 +0000
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<CAKVJ-_6v2r4R=F-sEAtC9TCLsuU1VxNi6vk-E4gsd2e=Ri0pjQ@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
	<CAErFSojxeHBTcNK0GiYQ8D-MbPgzMvZ8xfnbeVU0-KaCNq7ZXw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
	<CAErFSoitVuxPbBHbHcEh=dZ+A8qPjjmNvF14iYBVK=FKRKL5ig@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7EF@CHIMBX5.ad.uillinois.edu>

re: thread-safe perl, so-so at best from what I understand.

chris

On Feb 7, 2013, at 10:09 AM, Aaron Mackey <amackey at virginia.edu> wrote:

> e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries:
> 
>   while (my $seq = $seqio->nextseq) { $ct++ };
> 
> in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries.
> 
> This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material.
> 
> Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using.  What's the state of thread-safe Perl these days?
> 
> -Aaron
> 
> 
> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J <cjfields at illinois.edu> wrote:
> This will likely be the approach for more NGS-friendly Bio::Seq class.  Calculation of the PHRED scores could also be deferred until needed.
> 
> seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it.
> 
> chris
> 
> On Feb 7, 2013, at 9:25 AM, Aaron Mackey <amackey at virginia.edu> wrote:
> 
> > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used.  This also usually provides some error tolerance.
> >
> > -Aaron


From hartzell at alerce.com  Thu Feb  7 21:36:24 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 13:36:24 -0800
Subject: [Bioperl-l]  BioPerl long-term,
	was Re:  dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <20756.7768.125680.662488@gargle.gargle.HOWL>

Fields, Christopher J writes:
 > George,
 > 
 > Should put your post on a pedestal :)
 > 
 > tl;dr version: I completely agree, but we need help in order to do this.
 > [...]

And therein lies the [a] problem.  Don't look at me....

I'm not coding on bioinformatics problems these days (though I'm
available...) so _maybe_ I shouldn't have gotten up on the soapbox.

But I'm so sick of getting into arguments (or walking away from
them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
you can't write good code in Perl, look - Ruby has GEMS!, etc...

Perl of the olden days was an easy language in which to write really
shitty code.  Even the Perl of the BioPerl heyday wasn't really much
help; roll your own OO, roll your own distro-building, mountains of
monkey-work to provide consistent POD, versioning, etc...

But that's not the Perl that I use.  I have Moose and Moo.  TAP and
the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

It isn't any harder to write good code, for measures that I care
about, using Perl than it is *any* of the other similar languages.

And it's just as easy, and happens just as frequently, for people to
write shitty (undocumented, untested, poorly managed, poorly packaged,
...) stuff in the other languages.

GET OFF MY LAWN, KID! (Yeah, I know...)

But BioPerl *is* dying.  You might be standing on the shoulders of
giants when you use it to solve a problem, but you *definitely* have
those same giants (and their extended families) on your shoulders
every time I see you try to move the project forward.  All of that
history has become the tail that's wagging the dog.

If all y'all are going to keep the thing alive, moving forward and
contributing to new great works then make Apple your hero.  Deprecate
the stuff that's holding you back, give folks a path forward and move
on.

Have fun.  Use sharp tools.  Do cool science.  Build cool things.
Advance your careers (forgot that one last time).  Be reasonable and
professional.

Supporting last year's projects is someone else's business
opportunity.

g.

ps.  Are all y'all following this thread?

     http://news.ycombinator.com/item?id=5123022

Maybe someone should search down for this bit: "Where to start? Any
list of this [sic] projects?" and insert a plug for the various
open-bio projects.  (But "someone" doesn't work here, he said...).


From cjfields at illinois.edu  Thu Feb  7 23:12:19 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 23:12:19 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re:  dependencies on perl version
In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1D071@CHIMBX5.ad.uillinois.edu>

On Feb 7, 2013, at 3:36 PM, George Hartzell <hartzell at alerce.com> wrote:

> Fields, Christopher J writes:
>> George,
>> 
>> Should put your post on a pedestal :)
>> 
>> tl;dr version: I completely agree, but we need help in order to do this.
>> [...]
> 
> And therein lies the [a] problem.  Don't look at me....
> 
> I'm not coding on bioinformatics problems these days (though I'm
> available...) so _maybe_ I shouldn't have gotten up on the soapbox.
> 
> But I'm so sick of getting into arguments (or walking away from
> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
> you can't write good code in Perl, look - Ruby has GEMS!, etc...

Right, but that's a perception not just in the Bio* world.  It's larger and more pervasive than that.  

> Perl of the olden days was an easy language in which to write really
> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
> help; roll your own OO, roll your own distro-building, mountains of
> monkey-work to provide consistent POD, versioning, etc...
> 
> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

Yes, and that is the direction we need to go in.

> It isn't any harder to write good code, for measures that I care
> about, using Perl than it is *any* of the other similar languages.
> 
> And it's just as easy, and happens just as frequently, for people to
> write shitty (undocumented, untested, poorly managed, poorly packaged,
> ...) stuff in the other languages.

Oh, I know.  I'm working on some very nice looking but terribly implemented Python code now.

> GET OFF MY LAWN, KID! (Yeah, I know...)
> 
> But BioPerl *is* dying.  You might be standing on the shoulders of
> giants when you use it to solve a problem, but you *definitely* have
> those same giants (and their extended families) on your shoulders
> every time I see you try to move the project forward.  All of that
> history has become the tail that's wagging the dog.

Yep.

> If all y'all are going to keep the thing alive, moving forward and
> contributing to new great works then make Apple your hero.  Deprecate
> the stuff that's holding you back, give folks a path forward and move
> on.

That's fine.

> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
> Advance your careers (forgot that one last time).  Be reasonable and
> professional.
> 
> Supporting last year's projects is someone else's business
> opportunity.
> 
> g.

Right, but this isn't just my show.  I can't do this alone; it's simply too much code and I don't have even 1/4 the time I used to have.

> ps.  Are all y'all following this thread?
> 
>     http://news.ycombinator.com/item?id=5123022
> 
> Maybe someone should search down for this bit: "Where to start? Any
> list of this [sic] projects?" and insert a plug for the various
> open-bio projects.  (But "someone" doesn't work here, he said...).

Read the original guy's post.  He's completely delusional (okay, maybe not *completely*, but he comes across as quite bitter and unrealistic).  

Frankly I don't feel so bad if he wants to leave.  He doesn't like messy things.  Biology is messy; if one doesn't understand that, then computational biology is not for them.

chris


From carandraug+dev at gmail.com  Fri Feb  8 04:12:22 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Fri, 8 Feb 2013 04:12:22 +0000
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
Message-ID: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>

On 6 February 2013 22:11, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
> [...]
> So:
>
> If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.
>
> If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.
>
> If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.
>
> If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).
>
> If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later.
>
> But I can't do it alone.  We (not just me, but we) need to drive the direction we take.
>
> First one who codes gets the gold ring.

Hi

I know I'm not much involved with bioperl development but here's my
suggestion as maintainer of another quite modular free software
project. I swear I'm not promoting it. Skip to the last paragraph for
the very short version.

Octave Forge is now a collection of packages for GNU Octave, each
released independently whenever its maintainer sees fit. But it wasn't
like that before. For a long time, everything was released at the same
time; there were no independent packages. Then it was decided to split
it into sections: main, extra and nonfree (free software dependent on
non-free libraries, now purged), and inside those, it was split into
packages, each with its own maintainer. But some packages were (and
are) more active than the others. Some packages even came from single
contributions and we never heard from the authors again. And so, with
time, cruft settled in.

We didn't want to remove the code, but no one was interested, or
comfortable enough in the field, to fix it either. Packages that had a
much more active development were being dragged down by code that no
one was maintaining. So we broke with that and each package is now
released independently. Yes, we have packages that haven't been
released in 3 years, but that just shows which packages no one cares
about. Those have been marked as unmaintained and anyone can come
around and make a release if they care about it.

As the maintainer of the project, I do *not* make the releases of the
packages. The package maintainers prepare everything and upload the
release; I only run a handful of tests (takes me 10 min), upload it to
our server, and make the official announcement. I am also the
maintainer of one of the packages, and have often made releases of
unmaintained packages because I needed them. That goes to show that if
they are important enough for someone, they will get a release
somehow. If they are not important, why would we waste our time on
them anyway? We now have around 5 package releases per month, many of
them being minor releases with a handful of bug fixes. Preparing a
release of a small package is much easier and much less trouble than
preparing a giant release encompassing all of them at the same time.

Short version:
I'd recommend splitting the project into much smaller ones. Some of
the small ones will wither and die, but those are the less important
ones, and that will give the others, the ones that people care about,
freedom to grow faster. Bioperl would still be just one project that
incorporates a hundred or so smaller modules. Let those who care the
most about a specific module take care of it and make the releases.
Releasing a module becomes much simpler, which means more releases and
more activity, and the smaller code base for each module also makes it
less intimidating for new contributors.

Carnë


From hartzell at alerce.com  Fri Feb  8 06:17:17 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 22:17:17 -0800
Subject: [Bioperl-l] injecting a bit of levity....
Message-ID: <20756.39021.553502.116384@gargle.gargle.HOWL>


Perl's not dead.  It's FAMOUS!

  http://imgs.xkcd.com/comics/perl_problems.png

g.


From carandraug+dev at gmail.com  Fri Feb  8 06:57:30 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Fri, 8 Feb 2013 06:57:30 +0000
Subject: [Bioperl-l] getting a Bio::Search::HSP::HSPI from Bio::SimpleAlign
 (to find differences between sequences)
Message-ID: <CAPOrs_084-eh9kq=uWk19jvLagKKGr2qOs3HpGLpBt7YOLaO4A@mail.gmail.com>

Hi

I already have a Bio::SimpleAlign object (got it after using TCoffee
through bioperl-run module) and I'm trying to get a
Bio::Search::HSP::HSPI object from a pair of the aligned sequences.
How can I do this? I want to use the seq_inds method to compare the
sequences.

Here's my actual problem just in case I should be trying to fix it
some other way. I have a bunch of sequences from protein isoforms.
They have small differences between them, point-mutations, small
insertions or deletions, nothing too big. I want to make a table of
the mutations that each of them has against the consensus sequence. I
already made the alignment and got the consensus with
"$align->consensus_string". Now, I want to get something like:

isoform1: Ala67Gly, His90_Met91insGln
isoform2: ....

The seq_inds method from the Bio::Search::HSP::HSPI class seems to do
the part of finding the differences, but how can I get such an object?
I can't find it in the documentation.

Any tips, and even showing a different approach to my problem, are
most appreciated. Thanks,

Carnë
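
One possible approach that does not need an HSP object at all is to walk the
alignment column by column. The sketch below is plain Perl with no BioPerl
calls; the function name list_differences, the toy sequences and the
simplified one-letter labels are all made up for illustration. It assumes
both strings are rows of the same alignment (equal length, '-' for gaps), so
a consensus string would only work as the reference if it is gapped the same
way.

```perl
use strict;
use warnings;

# Compare one aligned sequence against an aligned reference (for example
# the consensus). Both strings must have the same length and use '-' for
# gaps. Returns simplified one-letter labels such as K2R, A5del, 3_4insQ.
sub list_differences {
    my ($ref, $seq) = @_;
    die "aligned strings differ in length\n"
        unless length($ref) == length($seq);

    my @diffs;
    my $pos = 0;              # 1-based position in the ungapped reference
    my $col = 0;              # 0-based alignment column
    my $len = length $ref;

    while ($col < $len) {
        my $r = substr($ref, $col, 1);
        my $s = substr($seq, $col, 1);

        if ($r eq '-') {
            # Gap in the reference: collect the whole inserted stretch.
            my $ins = '';
            while ($col < $len && substr($ref, $col, 1) eq '-') {
                my $c = substr($seq, $col, 1);
                $ins .= $c unless $c eq '-';
                $col++;
            }
            push @diffs, sprintf('%d_%dins%s', $pos, $pos + 1, $ins)
                if length $ins;
            next;
        }

        $pos++;
        if    ($s eq '-') { push @diffs, "$r${pos}del" }   # deletion
        elsif ($s ne $r)  { push @diffs, "$r$pos$s" }      # substitution
        $col++;
    }
    return @diffs;
}

# Toy data, not real isoforms.
print "isoform1: ", join(', ', list_differences('MKT-LA', 'MRTQL-')), "\n";
```

Mapping the one-letter codes to three-letter names (Ala67Gly style) would be
a simple lookup table on top of this.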


From l.m.timmermans at students.uu.nl  Fri Feb  8 11:18:58 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Fri, 8 Feb 2013 12:18:58 +0100
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
Message-ID: <CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>

On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell <hartzell at alerce.com> wrote:
> But I'm so sick of getting into arguments (or walking away from
> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
> you can't write good code in Perl, look - Ruby has GEMS!, etc...
>
> Perl of the olden days was an easy language in which to write really
> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
> help; roll your own OO, roll your own distro-building, mountains of
> monkey-work to provide consistent POD, versioning, etc...
>
> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.

I share that experience.

> But BioPerl *is* dying.  You might be standing on the shoulders of
> giants when you use it to solve a problem, but you *definitely* have
> those same giants (and their extended families) on your shoulders
> every time I see you try to move the project forward.  All of that
> history has become the tail that's wagging the dog.

I share your sentiment. Most of BioPerl is architected so badly I
can't stomach it most days, and I've worked on hairy codebases
including perl itself. There's just too much sick and wrong. It's like
hundreds of dot-com-era cgi scripts.

The problem (which is common in scientific computing) is that once
code works it's effectively abandoned. BioPerl is essentially a
gathering of more than a thousand such modules.

> If all y'all are going to keep the thing alive, moving forward and
> contributing to new great works then make Apple your hero.  Deprecate
> the stuff that's holding you back, give folks a path forward and move
> on.

That would be lovely, but who is going to do that? We're suffering
from the tragedy of the commons.

> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
> Advance your careers (forgot that one last time).  Be reasonable and
> professional.

Sounds like good advice to me :-)

> Supporting last year's projects is someone else's business
> opportunity.

True!

> ps.  Are all y'all following this thread?
>
>      http://news.ycombinator.com/item?id=5123022
>
> Maybe someone should search down for this bit: "Where to start? Any
> list of this [sic] projects?" and insert a plug for the various
> open-bio projects.  (But "someone" doesn't work here, he said...).

Interesting discussion, though the original post is too cynical even
for my taste.

Leon


From cjfields at illinois.edu  Fri Feb  8 14:08:56 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 8 Feb 2013 14:08:56 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>
References: <CAPOrs_0MpNzO7H9kNbN2NaZcfqpdJbaLfzTYN+geOzAzKkCzLA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBCNdv=+sNHzvkkt3FbqEEAwXkEsg=-sUKzb2fVpro5bQ@mail.gmail.com>
	<5111C653.2010703@gmail.com>
	<A9DCA783-CB54-4421-B14B-68503D7B1E54@drycafe.net>
	<CAC1jpXBC4b4Gj7uhsVbavFPX31Os9XvwZRb=r0HeWnaZ+DsLpA@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
	<20754.39343.128576.743448@gargle.gargle.HOWL>
	<CAErFSogmF3WKd+Np9=Xw47r+9Ogw5Ss6qVyS+erbFhJXzWqWxg@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
	<20756.7768.125680.662488@gargle.gargle.HOWL>
	<CAC1jpXA-bu20fP0WsRi=bJKxnBkfL=KJyB5n8h_XMh6eTOq3uQ@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1DA2D@CHIMBX5.ad.uillinois.edu>

On Feb 8, 2013, at 5:18 AM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell <hartzell at alerce.com> wrote:
>> But I'm so sick of getting into arguments (or walking away from
>> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
>> you can't write good code in Perl, look - Ruby has GEMS!, etc...
>> 
>> Perl of the olden days was an easy language in which to write really
>> shitty code.  Even the Perl of the BioPerl heyday wasn't really much
>> help; roll your own OO, roll your own distro-building, mountains of
>> monkey-work to provide consistent POD, versioning, etc...
>> 
>> But that's not the Perl that I use.  I have Moose and Moo.  TAP and
>> the things built on it.  Dist::Zilla.  PerlTidy.  PerlCritic.  cpanm.
>> MetaCPAN.  Pinto.  GitHub.  Perlbrew.  Wow.
> 
> I share that experience.
> 
>> But BioPerl *is* dying.  You might be standing on the shoulders of
>> giants when you use it to solve a problem, but you *definitely* have
>> those same giants (and their extended families) on your shoulders
>> every time I see you try to move the project forward.  All of that
>> history has become the tail that's wagging the dog.
> 
> I share your sentiment. Most of BioPerl is architected so badly I
> can't stomach it most days, and I've worked on hairy codebases
> including perl itself. There's just too much sick and wrong. It's like
> hundreds of dot-com-era cgi scripts.
> 
> The problem (which is common in scientific computing) is that once
> code works it's effectively abandoned. BioPerl is essentially a
> gathering of more than a thousand such modules.

Yep, the progression from 'it works' to 'it works very well' tends to have very high activation energy.  Many of the fixes tend to be more bandaids (get it working) than fundamental surgery.  I tried my hand at this, got a few things done.

>> If all y'all are going to keep the thing alive, moving forward and
>> contributing to new great works then make Apple your hero.  Deprecate
>> the stuff that's holding you back, give folks a path forward and move
>> on.
> 
> That would be lovely, but who is going to do that? We're suffering
> from the tragedy of the commons.

Spot on, but we could break that path for the time being.  I think BioPerl as is will have to be in maintenance mode; we need a new effort to break with older perl, older practices.  

>> Have fun.  Use sharp tools.  Do cool science.  Build cool things.
>> Advance your careers (forgot that one last time).  Be reasonable and
>> professional.
> 
> Sounds like good advice to me :-)
> 
>> Supporting last year's projects is someone else's business
>> opportunity.
> 
> True!

We just need to make a bioperl 1.x branch for the maintenance bit, rechristen 'master' as 'v2', and just move on to fixing the f****** code.  Let's move on that.

>> ps.  Are all y'all following this thread?
>> 
>>     http://news.ycombinator.com/item?id=5123022
>> 
>> Maybe someone should search down for this bit: "Where to start? Any
>> list of this [sic] projects?" and insert a plug for the various
>> open-bio projects.  (But "someone" doesn't work here, he said...).
> 
> Interesting discussion, though the original post is too cynical even
> for my taste.
> 
> Leon

Yes, that's not unusual unfortunately.  We have a number of physicists and mathematicians here who have started their initial forays into computational biology; they're all startled at how noisy it is and how messy code can be.  Of course their disciplines have had the benefit of teaching students how to (somewhat decently) code for the last 40 years.

chris


From l.m.timmermans at students.uu.nl  Fri Feb  8 12:08:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Fri, 8 Feb 2013 13:08:06 +0100
Subject: [Bioperl-l] BioPerl long-term,
	was Re: dependencies on perl version
In-Reply-To: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
References: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
Message-ID: <CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>

On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
> Short version:
> I'd recommend splitting the project into much smaller ones. Some of
> the small ones will wither and die, but those are the less important
> ones, and that will give the others, the ones that people care about,
> freedom to grow faster. Bioperl would still be just one project that
> incorporates a hundred or so smaller modules. Let those who care the
> most about a specific module take care of it and make the releases.
> Releasing a module becomes much simpler, which means more releases and
> more activity, and the smaller code base for each module also makes it
> less intimidating for new contributors.

That has been a goal for some time now, but it's fairly complicated.
Not only do we have a LOT of modules (bioperl-live alone is more than
900), they also have complicated dependencies. I've attached the
results of my static dependency analysis of bioperl-live. I suspect
this split-up needs to be done by automated graph analysis; it's too
much to do by hand.
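
To give an idea of what such an analysis might look like, here is a rough
sketch. It is not the script that produced the attached files; the helper
names extract_deps and closure and the toy module sources are invented for
illustration. It scans source text for use/require lines that name Bio::*
modules, then computes everything a module pulls in transitively.

```perl
use strict;
use warnings;

# Collect Bio::* modules named on use/require lines (this also covers
# "use base qw(...)" and "use parent"). Naive on purpose: it does not
# skip POD or string literals.
sub extract_deps {
    my ($source) = @_;
    my %deps;
    for my $line (split /\n/, $source) {
        next if $line =~ /^\s*#/;                       # comment line
        next unless $line =~ /^\s*(?:use|require)\b/;
        $deps{$_} = 1 for $line =~ /\b(Bio(?:::\w+)+)/g;
    }
    return sort keys %deps;
}

# Everything a module depends on, directly or indirectly (iterative
# depth-first walk over the dependency graph; cycles are handled).
sub closure {
    my ($graph, $start) = @_;
    my %seen;
    my @stack = ($start);
    while (defined(my $mod = pop @stack)) {
        next if $seen{$mod}++;
        push @stack, @{ $graph->{$mod} || [] };
    }
    delete $seen{$start};
    return sort keys %seen;
}

# Toy sources standing in for real files; the contents are invented.
my %sources = (
    'Bio::Seq'        => "use base qw(Bio::Root::Root Bio::SeqI);\nuse Bio::PrimarySeq;\n",
    'Bio::PrimarySeq' => "use base qw(Bio::Root::Root);\n",
    'Bio::SeqI'       => "# interface only\n",
    'Bio::Root::Root' => "use strict;\n",
);

my %graph;
$graph{$_} = [ extract_deps($sources{$_}) ] for keys %sources;

for my $mod (sort keys %graph) {
    printf "%-16s needs: %s\n", $mod,
        join(', ', closure(\%graph, $mod)) || '(nothing)';
}
```

Modules with small closures would be the easiest candidates to split out
first; large or mutually dependent clusters would have to move together.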

Leon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.dot
Type: application/octet-stream
Size: 93463 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20130208/bdbbda1e/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.png
Type: image/png
Size: 6694525 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20130208/bdbbda1e/attachment-0004.png>

From sebastien.moretti at unil.ch  Fri Feb  8 16:19:29 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=)
Date: Fri, 08 Feb 2013 17:19:29 +0100
Subject: [Bioperl-l] PhyloXML
Message-ID: <51152591.9010402@unil.ch>

Hi

I would like to add some XML to an existing PhyloXML tree.

Reading and writing it works without problems.
I would like to add <name>something</name> after the <phylogeny> tag, as in
http://www.phyloxml.org/examples_syntax/phyloxml_syntax_example_1.html
but I run into problems with add_phyloXML_annotation():

Can't locate object method "annotation" via package "Bio::Tree::Tree" at
    /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, <GEN0> line 1 (#1)
    (F) You called a method correctly, and it correctly indicated a package
    functioning as a class, but that package doesn't define that particular
    method, nor does any of its base classes.  See perlobj.

Uncaught exception from user code:
    Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, <GEN0> line 1.
 at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984
    Bio::TreeIO::phyloxml::element_default('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 670
    Bio::TreeIO::phyloxml::processXMLNode('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 309
    Bio::TreeIO::phyloxml::add_phyloXML_annotation('Bio::TreeIO::phyloxml=HASH(0x134b1268)', '-obj', 'Bio::Tree::Tree=HASH(0x13525258)', '-xml', '<name>SUMF family</name>') called at ./add_annotation_to_phyloxml.pl line 40


I think I am doing something wrong, but what?
Here is the code:

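use Bio::TreeIO;

# $infile holds the path to the input PhyloXML file
my $treeio = Bio::TreeIO->new(-file   => $infile,
                              -format => 'phyloxml',
                             );
my $tree = $treeio->next_tree;

# Add annotation (this is the call that dies with the error above)
$treeio->add_phyloXML_annotation(-obj => $tree,
                                 -xml => '<name>SUMF family</name>',
                                );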

-- 
Sébastien Moretti


From cjfields at illinois.edu  Sat Feb  9 06:25:17 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 06:25:17 +0000
Subject: [Bioperl-l] BioPerl future
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>

All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.  

Thoughts?  

chris


From cjfields at illinois.edu  Sat Feb  9 06:43:24 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 06:43:24 +0000
Subject: [Bioperl-l] BioPerl long-term,
 was Re: dependencies on perl version
In-Reply-To: <CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>
References: <CAPOrs_1+oYc20aMvUKOKdeX78XwdZaduh7LKeEG=UQrRgYB6+A@mail.gmail.com>
	<CAC1jpXAZJK=B_GDOTb=zznj=p+bmTQq9QrD6Lkw+do7kM89K2w@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F2C6@CHIMBX5.ad.uillinois.edu>

On Feb 8, 2013, at 6:08 AM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>> Short version:
>> I'd recommend splitting the project into much smaller ones. Some of
>> the small ones will wither and die, but those are the less important
>> ones, and that will give the others, the ones that people care about,
>> freedom to grow faster. Bioperl would still be just one project that
>> incorporates a hundred or so smaller modules. Let those who care the
>> most about a specific module take care of it and make the releases.
>> Releasing a module becomes much simpler, which means more releases and
>> more activity, and the smaller code base for each module also makes it
>> less intimidating for new contributors.
> 
> That has been a goal for some time now, but it's fairly complicated.
> Not only do we have a LOT of modules (bioperl-live alone is more than
> 900), they also have complicated dependencies. I've attached the
> results of my static dependency analysis of bioperl-live. I suspect
> this split-up needs to be done by automated graph analysis; it's too
> much to do by hand.
> 
> Leon
> <deps.dot><deps.png>

Leon, 

I'm hoping we can do this sooner rather than later.  In fact, if we proceed with making a 'v1' branch or something similar, we can start extricating code within the next few weeks.

chris


From cjfields at illinois.edu  Sat Feb  9 13:51:35 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sat, 9 Feb 2013 13:51:35 +0000
Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future
Message-ID: <prc698q0fqtymq1n70jhdi5w.1360417710993@email.android.com>

Sheldon,

The branch is where the old (v1.x) code would reside.  Master branch would be v2.

Chris


Sent via phone


-------- Original message --------
From: Sheldon McKay <sheldon.mckay at gmail.com>
Date:
To: "Fields, Christopher J" <cjfields at illinois.edu>
Cc: BioPerl List <Bioperl-l at lists.open-bio.org>,gmod-gbrowse at lists.sourceforge.net
Subject: Re: [Gmod-gbrowse] BioPerl future


Hi Chris,

This sounds like a good idea.  I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands.

One question I have is whether the name of branch v1 might be perceived as a step backward.  How about v2?

Sheldon

On Saturday, February 9, 2013, Fields, Christopher J wrote:
All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.

Thoughts?

chris


--
Sheldon McKay, PhD
Computational Biologist
DNA Learning Center
Cold Spring Harbor Laboratory
1 Bungtown Rd
Cold Spring Harbor, NY 11724
(516) 367-5185
www.dnalc.org


From sheldon.mckay at gmail.com  Sat Feb  9 13:04:50 2013
From: sheldon.mckay at gmail.com (Sheldon McKay)
Date: Sat, 9 Feb 2013 08:04:50 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAEs59kkOhJ-czn_aXOcP+yOszQdGGLgaAMNp+u_0MqS=xXapng@mail.gmail.com>

Hi Chris,

This sounds like a good idea.  I think it will eventually allow bioperl to
evolve into a leaner, meaner package that would be more likely to be
adopted by new or isolated bioinformaticians, who tend to be put off by
the size and complexity of bioperl as it now stands.

One question I have is whether the name of branch v1 might be perceived as
a step backward.  How about v2?

Sheldon

On Saturday, February 9, 2013, Fields, Christopher J wrote:

> All,
>
> (cross-posting to gmod-gbrowse)
>
> I want to gauge the community's thoughts on a few things.  At the moment I
> think we can safely say that BioPerl 1.x is in maintenance mode.  By
> 'maintenance mode', I mean that we can only do so much with it w/o breaking
> backwards compatibility with old scripts.  We need a way forward so that we
> can address fundamental problems within the core codebase, namely speed.
>
> I am thinking at the moment of pushing a 'v1' branch next week after I
> make an official announcement, with a new 1.6 release coming out from that
> branch (as already announced, tentatively scheduled for March 1).  That
> frees up master for any code development, removal of modules/cruft, etc.
>  This will open an initial path forward and at least enable us to do more.
>  Make sense?  This of course means that any code reliant on v1 should pull
> from that branch instead of 'master'.
>
> Thoughts?
>
> chris
>


-- 
Sheldon McKay, PhD
Computational Biologist
DNA Learning Center
Cold Spring Harbor Laboratory
1 Bungtown Rd
Cold Spring Harbor, NY 11724
(516) 367-5185
www.dnalc.org


From cjfields at illinois.edu  Sun Feb 10 04:25:14 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 10 Feb 2013 04:25:14 +0000
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>

Apologies if you receive this twice. I never received the replies from the gbrowse list through bioperl-l so it is possible there were mail issues last night.

------------------------

All,

(cross-posting to gmod-gbrowse)

I want to gauge the community's thoughts on a few things.  At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.  By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts.  We need a way forward so that we can address fundamental problems within the core codebase, namely speed.

I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).  That frees up master for any code development, removal of modules/cruft, etc.  This will open an initial path forward and at least enable us to do more.  Make sense?  This of course means that any code reliant on v1 should pull from that branch instead of 'master'.  

Thoughts?  

chris


From genehack at genehack.org  Sun Feb 10 04:36:07 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Sat, 9 Feb 2013 20:36:07 -0800
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>
References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu>
	<118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu>
Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6@genehack.org>

On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:

> Thoughts?  

+1

The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. 

j.

-- 
John SJ Anderson // genehack at genehack.org


From carandraug+dev at gmail.com  Sun Feb 10 18:40:33 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Sun, 10 Feb 2013 18:40:33 +0000
Subject: [Bioperl-l] BioPerl future
Message-ID: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>

On 10 February 2013 17:00,  <bioperl-l-request at lists.open-bio.org> wrote:
> Message: 3
> Date: Sat, 9 Feb 2013 20:36:07 -0800
> From: John SJ Anderson <genehack at genehack.org>
> Subject: Re: [Bioperl-l] BioPerl future
> To: "Fields, Christopher J" <cjfields at illinois.edu>
> Cc: BioPerl List <Bioperl-l at lists.open-bio.org>
> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org>
> Content-Type: text/plain; charset=us-ascii
>
> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
>
>> Thoughts?
>
> +1
>
> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories.

For those interested, I have just added instructions on the wiki on
how to split a subset of modules, tests, files, etc from the
bioperl-live repository into a new repository while keeping their old
history.

http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live

Carnë


From cjfields at illinois.edu  Sun Feb 10 20:08:35 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 10 Feb 2013 20:08:35 +0000
Subject: [Bioperl-l] BioPerl future
In-Reply-To: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>
References: <CAPOrs_21WBiRwngD8_U4di_0WnXCz8cUHjv+oL6_m_UadBMfDg@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE20632@CHIMBX5.ad.uillinois.edu>

On Feb 10, 2013, at 12:40 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 10 February 2013 17:00,  <bioperl-l-request at lists.open-bio.org> wrote:
>> Message: 3
>> Date: Sat, 9 Feb 2013 20:36:07 -0800
>> From: John SJ Anderson <genehack at genehack.org>
>> Subject: Re: [Bioperl-l] BioPerl future
>> To: "Fields, Christopher J" <cjfields at illinois.edu>
>> Cc: BioPerl List <Bioperl-l at lists.open-bio.org>
>> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
>> 
>>> Thoughts?
>> 
>> +1
>> 
>> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories.
> 
> For those interested, I have just added instructions on the wiki on
> how to split a subset of modules, tests, files, etc from the
> bioperl-live repository into a new repository while keeping their old
> history.
> 
> http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live
> 
> Carnë

It's probably worth looking at this page as well, then:

http://www.bioperl.org/wiki/BioPerl_Modularization

We should probably merge the two.

chris


From hlapp at drycafe.net  Mon Feb 11 01:03:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Sun, 10 Feb 2013 20:03:34 -0500
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <51152591.9010402@unil.ch>
References: <51152591.9010402@unil.ch>
Message-ID: <F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>

On Feb 8, 2013, at 11:19 AM, Moretti Sébastien <sebastien.moretti at unil.ch> wrote:

> # Add annotation
> $treeio->add_phyloXML_annotation(-obj => $tree,
>                                -xml => '<name>SUMF family</name>',
>                               );

If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?

	-hilmar

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From sebastien.moretti at unil.ch  Mon Feb 11 07:08:22 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=)
Date: Mon, 11 Feb 2013 08:08:22 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
Message-ID: <511898E6.7060400@unil.ch>

>> # Add annotation
>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>                                 -xml => '<name>SUMF family</name>',
>>                                );
>
> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>
> 	-hilmar

I replaced $treeio with $tree in the above line but still get an error.
I don't see what you mean by "the stack suggests that the above isn't the
exact line in your script".

The only thing I changed is the length of the XML string I try to
insert, but I get the same error with an empty XML string.


my $treeio = new Bio::TreeIO(-file   => "$infile",
                              -format => 'phyloxml',
                             );
my $tree = $treeio->next_tree;

# Add annotation
$tree->add_phyloXML_annotation(-obj => $tree,
                                -xml => '<name>SUMF family</name>',
                               );

Can't locate object method "add_phyloXML_annotation" via package
	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
     (F) You called a method correctly, and it correctly indicated a package
     functioning as a class, but that package doesn't define that particular
     method, nor does any of its base classes.  See perlobj.

Uncaught exception from user code:
	Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
  at ./add_annotation_to_phyloxml.pl line 40


-- 
Sébastien Moretti
Department of Ecology and Evolution,
Biophore, University of Lausanne,
CH-1015 Lausanne, Switzerland
Tel.: +41 (21) 692 4221/4079
http://bioinfo.unil.ch/


From saladi1 at illinois.edu  Tue Feb 12 21:24:34 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Tue, 12 Feb 2013 13:24:34 -0800
Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons
Message-ID: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>

Hi,

I am using the count_codons method from Bio::Tools::SeqStats and keep
getting "AMBIGUOUS" codons, but I can't figure out why exactly.

When I translate the same sequence that gives the error using another
standard utility like (ExPASy - Translate), it seems to work alright.

An example sequence is below. Could anyone lend some insight?

Thanks,
Shyam


codon         value
AAA           1.722488038277511961722488038277511961722
AAC           2.966507177033492822966507177033492822967
AAG           1.531100478468899521531100478468899521531
AAT           0.9569377990430622009569377990430622009569
ACA           0.4784688995215311004784688995215311004785
ACC           1.722488038277511961722488038277511961722
ACG           1.33971291866028708133971291866028708134
ACT           1.913875598086124401913875598086124401914
AGA           0.1913875598086124401913875598086124401914
AGC           0.7655502392344497607655502392344497607656
AGT           1.435406698564593301435406698564593301435
*AMBIGUOUS*   0.09569377990430622009569377990430622009569
ATA           0.3827751196172248803827751196172248803828
ATC           2.488038277511961722488038277511961722488
ATG           3.349282296650717703349282296650717703349
ATT           3.636363636363636363636363636363636363636
CAA           2.870813397129186602870813397129186602871
CAC           0.3827751196172248803827751196172248803828
CAG           1.626794258373205741626794258373205741627
CAT           0.4784688995215311004784688995215311004785
CCA           1.722488038277511961722488038277511961722
CCC           0.5741626794258373205741626794258373205742
CCG           1.052631578947368421052631578947368421053
CCT           1.244019138755980861244019138755980861244
CGA           0.3827751196172248803827751196172248803828
CGC           0.7655502392344497607655502392344497607656
CGG           0.1913875598086124401913875598086124401914
CGT           2.488038277511961722488038277511961722488
CTA           0.4784688995215311004784688995215311004785
CTC           0.6698564593301435406698564593301435406699
CTG           2.105263157894736842105263157894736842105
CTT           0.8612440191387559808612440191387559808612
GAA           2.870813397129186602870813397129186602871
GAC           1.435406698564593301435406698564593301435
GAG           1.722488038277511961722488038277511961722
GAT           2.775119617224880382775119617224880382775
GCA           2.00956937799043062200956937799043062201
GCC           2.488038277511961722488038277511961722488
GCG           3.540669856459330143540669856459330143541
GCT           2.00956937799043062200956937799043062201
GGA           0.1913875598086124401913875598086124401914
GGC           2.392344497607655502392344497607655502392
GGG           0.8612440191387559808612440191387559808612
GGT           5.454545454545454545454545454545454545455
GTA           1.913875598086124401913875598086124401914
GTC           0.8612440191387559808612440191387559808612
GTG           4.593301435406698564593301435406698564593
GTT           2.679425837320574162679425837320574162679
TAA           0.09569377990430622009569377990430622009569
TAC           1.148325358851674641148325358851674641148
TAT           1.148325358851674641148325358851674641148
TCA           0.8612440191387559808612440191387559808612
TCC           0.4784688995215311004784688995215311004785
TCG           2.105263157894736842105263157894736842105
TCT           0.9569377990430622009569377990430622009569
TGG           0.9569377990430622009569377990430622009569
TGT           0.09569377990430622009569377990430622009569
TTA           2.679425837320574162679425837320574162679
TTC           2.966507177033492822966507177033492822967
TTG           3.062200956937799043062200956937799043062
TTT           2.775119617224880382775119617224880382775
count         1045
filename      temp.seq

ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTACGCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTCGTATTTGCCTTCGTACCACCTGC
GGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAGATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTAGGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA


From bosborne11 at verizon.net  Wed Feb 13 02:30:08 2013
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 12 Feb 2013 21:30:08 -0500
Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons
In-Reply-To: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>
References: <CAARX5cX31P-SwDAb1mfiCTUG00bBq_m37Eb3rBemSqD1TBo_nw@mail.gmail.com>
Message-ID: <C13C35A7-4DBE-4797-A584-DCB6AF772D25@verizon.net>

Shyam,

An ambiguous codon would be one that has a character other than [ACTGU] in it. I see '!' in your sequences, that would create an ambiguous codon.
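
To make that check concrete, here is a minimal sketch in plain Perl (no BioPerl needed) that walks a sequence codon by codon, in frame from the first base, and reports every codon containing a character other than A, C, G, T or U. The subroutine name find_ambiguous_codons and the toy sequence are made up for illustration, and this is not how Bio::Tools::SeqStats implements count_codons. IUPAC ambiguity codes such as Y or N would be flagged the same way as a stray '!'.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return [codon_number, codon] pairs for every
# in-frame codon holding a character other than A, C, G, T or U.
sub find_ambiguous_codons {
    my ($seq) = @_;
    $seq = uc $seq;
    $seq =~ s/\s+//g;    # drop whitespace and line breaks from wrapped input

    my @hits;
    my $n = 0;
    while ($seq =~ /(.{3})/g) {    # a trailing partial codon is ignored
        my $codon = $1;
        $n++;
        push @hits, [ $n, $codon ] if $codon =~ /[^ACGTU]/;
    }
    return @hits;
}

# Toy sequence invented for this example, not the one posted to the list.
my $seq = 'ATGGCTTYTTGA';
for my $hit (find_ambiguous_codons($seq)) {
    printf "codon %d: %s\n", @$hit;    # prints "codon 3: TYT"
}
```

Running the same loop over the sequence read from temp.seq should point at whichever codon is being counted as ambiguous.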

Brian O.


On Feb 12, 2013, at 4:24 PM, Shyam Saladi <saladi1 at illinois.edu> wrote:

> Hi,
> 
> I am using the count_codons method from Bio::Tools::SeqStats and keep
> getting "AMBIGUOUS" codons, but I can't figure out why exactly.
> 
> When I translate the same sequence that gives the error using another
> standard utility like (ExPASy - Translate), it seems to work alright.
> 
> An example sequence is below. Could anyone lend some insight?
> 
> Thanks,
> Shyam
> 
> 
> ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTAC!
> GCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTC!
> GTATTTGCCTTCGTACCACCTGCGGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAG
> ATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTA!
> GGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Feb 13 15:18:10 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 15:18:10 +0000
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>

All,

tl;dr: A lot of change is coming.  Be forewarned and be prepared.

This is an 'official' announcement to the BioPerl community on future BioPerl plans.  We have decided to move continued maintenance of the current BioPerl release series over to the new 'v1' branch.  Any future versions of the 1.6.x code will be released from this branch, starting with the (already-scheduled) March 1 release.  The 'master' branch will become the main focal point for future development of BioPerl going into an eventual v2 release, with a focus on performance enhancements, addressing newer technologies like NGS and large data, code cleanup, and simplifying the code base.

We welcome any help with code improvements. GMOD folks? Want to help? This is a good opportunity to address BioPerl shortcomings in the code base!

What this means for anyone using BioPerl currently:

1) We anticipate significant issues if you are relying on the 'master' branch for anything.  To put it inelegantly, the core developers are taking back the 'master' branch for future development. Please please please do not rely on the 'master' branch for stable code; if you rely on BioPerl 1.6.x, make sure to use 'v1'.  We can revisit whether to make 'v1' the default checkout branch if/when the need arises.

2) Expect not to find some modules.  We will be migrating modules requiring external dependencies and other associated chunks of the code base out into their own repositories over the next year to help future maintenance; the eventual intent is to release all of these independently on CPAN.  We will completely remove all code previously marked as deprecated, and we may immediately deprecate additional modules if needed (this will of course be discussed on list).

3) Expect version numbering to change significantly.  Because we are releasing code in separate repositories, I fully expect downstream versioning problems if we stick with the current system (e.g. all bioperl-live modules having the same version).  It will be too much of a headache to sync versions for all modules as this will entail making a full release of all bioperl code, one of the main reasons we are splitting out code to begin with.  At the moment, no specific versioning scheme has been chosen, though I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions).  This is the standard that Lincoln has adopted for Bio::Graphics and GBrowse.

4) Expect quick deprecation of methods within modules as needed.  These should of course be brought up on the mailing list prior to actual implementation, but I would anticipate some things changing as we try to adopt a more consistent method naming scheme.

5) The same steps outlined for bioperl-live will apply for bioperl-run modules.  We will have to decide the best approach to use for those, e.g. whether to separate them out based on task (alignment), application group (NGS, BLAST, RNA), etc. and how these may fit organically with bioperl-live modules where appropriate.

6) Do not expect a new CPAN release of such code until Dec 2013.  Even then it will be in an alpha stage.  We are all busy campers.

We do not anticipate significant changes to bioperl-network or bioperl-db at this time beyond updating them to deal with new changes. 

I'm sure there are many other points that need to be discussed.   Please reply over the next week if you have any concerns. 

chris


From cjfields at illinois.edu  Wed Feb 13 16:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 16:01:07 +0000
Subject: [Bioperl-l] Test-pls ignore
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2506D@CHIMBX5.ad.uillinois.edu>

testing the mail list to see if it is working.

-c


From sebastien.moretti at unil.ch  Wed Feb 13 16:21:23 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=)
Date: Wed, 13 Feb 2013 17:21:23 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
Message-ID: <511BBD83.2000708@unil.ch>

>>>> # Add annotation
>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>                                 -xml => '<name>SUMF family</name>',
>>>>                                );
>>>
>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>
>>> 	-hilmar
>>
>> I replaced $treeio with $tree in the above line but still get an error.
>> I don't see what you mean by "the stack suggests that the above isn't the exact line in your script".
>>
>> The only thing I changed is the length of the XML string I try to insert, but I get the same error with an empty XML string.
>>
>>
>>
>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>                              -format => 'phyloxml',
>>                             );
>> my $tree = $treeio->next_tree;
>>
>> # Add annotation
>> $tree->add_phyloXML_annotation(-obj => $tree,
>>                                -xml => '<name>SUMF family</name>',
>>                               );
>>
>> Can't locate object method "add_phyloXML_annotation" via package
>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>     (F) You called a method correctly, and it correctly indicated a package
>>     functioning as a class, but that package doesn't define that particular
>>     method, nor does any of its base classes.  See perlobj.
>>
>> Uncaught exception from user code:
>> 	Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
>> at ./add_annotation_to_phyloxml.pl line 40
>
> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>
> chris

Do you mean that BioPerl 1.6.901 does not fully support PhyloXML?
Is the problem I have "expected"?

-- 
Sébastien Moretti


From cjfields at illinois.edu  Wed Feb 13 15:47:17 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 15:47:17 +0000
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <511898E6.7060400@unil.ch>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>

On Feb 11, 2013, at 1:08 AM, Sébastien MORETTI <sebastien.moretti at unil.ch> wrote:

>>> # Add annotation
>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>                                -xml => '<name>SUMF family</name>',
>>>                               );
>> 
>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>> 
>> 	-hilmar
> 
> I replaced $treeio by $tree in the above line but still get an error.
> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
> 
> The only thing I changed is the length of the xml string I try to insert. But get the same error with an empty xml string.
> 
> 
> 
> my $treeio = new Bio::TreeIO(-file   => "$infile",
>                             -format => 'phyloxml',
>                            );
> my $tree = $treeio->next_tree;
> 
> # Add annotation
> $tree->add_phyloXML_annotation(-obj => $tree,
>                               -xml => '<name>SUMF family</name>',
>                              );
> 
> Can't locate object method "add_phyloXML_annotation" via package
> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>    (F) You called a method correctly, and it correctly indicated a package
>    functioning as a class, but that package doesn't define that particular
>    method, nor does any of its base classes.  See perlobj.
> 
> Uncaught exception from user code:
> 	Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1.
> at ./add_annotation_to_phyloxml.pl line 40
> 
> 
> 
> -- 
> Sébastien Moretti
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221/4079
> http://bioinfo.unil.ch/\

Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.

chris


From carandraug+dev at gmail.com  Wed Feb 13 17:23:23 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Wed, 13 Feb 2013 17:23:23 +0000
Subject: [Bioperl-l] Next BioPerl release
Message-ID: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>

On 5 February 2013 21:53, Fields, Christopher J <cjfields at illinois.edu> wrote:
> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!

Hi

is this release of bioperl-live only or also includes bioperl-run?

Carnë


From cjfields at illinois.edu  Wed Feb 13 17:08:21 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 17:08:21 +0000
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <511BBD83.2000708@unil.ch>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
	<511BBD83.2000708@unil.ch>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 10:21 AM, Moretti Sébastien <sebastien.moretti at unil.ch> wrote:

>>>>> # Add annotation
>>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>>                                -xml => '<name>SUMF family</name>',
>>>>>                               );
>>>> 
>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>> 
>>>> 	-hilmar
>>> 
>>> I replaced $treeio by $tree in the above line but still get an error.
>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
>>> 
>>> The only thing I changed is the length of the xml string I try to insert. But get the same error with an empty xml string.
>>> 
>>> 
>>> 
>>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>>                             -format => 'phyloxml',
>>>                            );
>>> my $tree = $treeio->next_tree;
>>> 
>>> # Add annotation
>>> $tree->add_phyloXML_annotation(-obj => $tree,
>>>                               -xml => '<name>SUMF family</name>',
>>>                              );
>>> 
>>> Can't locate object method "add_phyloXML_annotation" via package
>>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>>    (F) You called a method correctly, and it correctly indicated a package
>>>    functioning as a class, but that package doesn't define that particular
>>>    method, nor does any of its base classes.  See perlobj.
>>> 
>>> Uncaught exception from user code:
>>> 	
>>> at ./add_annotation_to_phyloxml.pl line 40
>> 
>> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>> 
>> chris
> 
> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ?
> The problem I have is "expected" ?
> 
> -- 
> Sébastien Moretti

I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky.  I tried cleaning this up a few years back but didn't make much progress.

The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it):

    $treeio->add_phyloXML_annotation(-obj => $tree,
                              -xml => '<name>SUMF family</name>',
                             );

My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back.  Can you file a bug on this?

https://redmine.open-bio.org/
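
A quick way to check which object actually provides a method, before filing the bug, is the built-in `can` method. The sketch below uses toy stand-in classes (`My::Tree`, `My::TreeIO`, and the method name `add_annotation` are all hypothetical, not the real BioPerl classes) purely to illustrate the kind of check that distinguishes the two objects:

```perl
use strict;
use warnings;

# Toy stand-ins for illustration only; these are NOT the BioPerl classes.
{
    package My::Tree;
    sub new { return bless {}, shift }
}
{
    package My::TreeIO;
    sub new            { return bless {}, shift }
    sub add_annotation { return "annotated" }
}

my $tree   = My::Tree->new;
my $treeio = My::TreeIO->new;

# can() returns a code reference if the object's class (or a base class)
# defines the method, and a false value otherwise.
for my $obj ($tree, $treeio) {
    my $class = ref $obj;
    if ($obj->can('add_annotation')) {
        print "$class provides add_annotation\n";
    }
    else {
        print "$class does not provide add_annotation\n";
    }
}
```

Calling a method on an object whose class does not define it is what produces the "Can't locate object method" error quoted above.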

chris


From cjfields at illinois.edu  Wed Feb 13 18:05:53 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 18:05:53 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 11:23 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 5 February 2013 21:53, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> I am scheduling the next BioPerl CPAN release tentatively for March 1.  Any help in triaging bug reports would be greatly appreciated!
> 
> Hi
> 
> is this release of bioperl-live only or also includes bioperl-run?
> 
> Carnë

We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.

chris


From l.m.timmermans at students.uu.nl  Wed Feb 13 19:44:22 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 13 Feb 2013 20:44:22 +0100
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
Message-ID: <CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>

On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.

Also, bioperl-run needs to be cut into smaller distributions even more
than bioperl-live. Few people, if any, have all the tools it tries
to wrap at hand, so it's almost impossible to pass its test suite.

We need dists that can realistically pass.

Leon


From cjfields at illinois.edu  Wed Feb 13 21:04:26 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 21:04:26 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>
References: <CAPOrs_0HoMHm6u5VFgCRONsv8YF_OX5TE1dJLTS+qBTRuh_Btw@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
	<CAC1jpXBf+uOXHKpxb7o8t3pYttnnRF35A49zY5M-3mEOuniGCA@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25B07@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 1:44 PM, Leon Timmermans <l.m.timmermans at students.uu.nl> wrote:

> On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> We can work on a bioperl-run release.  It's too much to handle both in one go.  The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date.  I would really like a more flexible generic way of defining these that would allow for easier maintenance.
> 
> Also, bioperl-run needs to be cut into smaller distributions even more
> than bioperl-live. Few people, if any, have all the tools it tries
> to wrap at hand, so it's almost impossible to pass its test suite.
> 
> We need dists that can realistically pass.
> 
> Leon

Yup.  It's a mess.

chris


From florent.angly at gmail.com  Wed Feb 13 22:33:14 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 14 Feb 2013 08:33:14 +1000
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
Message-ID: <511C14AA.9030107@gmail.com>

On 14/02/13 01:18, Fields, Christopher J wrote:
> I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions)
Yes, I support the X.Y versioning as well.
Florent


From l.m.timmermans at students.uu.nl  Wed Feb 13 23:12:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Thu, 14 Feb 2013 00:12:06 +0100
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <511C14AA.9030107@gmail.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
	<511C14AA.9030107@gmail.com>
Message-ID: <CAC1jpXBk9prChjjeHmnykWh4j7FRMN1adY0ibzM8uqH1+Z5uGA@mail.gmail.com>

On Wed, Feb 13, 2013 at 11:33 PM, Florent Angly <florent.angly at gmail.com> wrote:
> On 14/02/13 01:18, Fields, Christopher J wrote:
>>
>> I *highly* recommend using X.Y versioning for simplicity (e.g. no more
>> 3-point versions)
>
> Yes, I support the X.Y versioning as well.
> Florent

See also: http://www.dagolden.com/index.php/369/version-numbers-should-be-boring/

Leon


From daisieh at gmail.com  Thu Feb 14 05:21:15 2013
From: daisieh at gmail.com (Daisie Huang)
Date: Wed, 13 Feb 2013 21:21:15 -0800 (PST)
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
References: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
Message-ID: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>

I think you need to reset the pointer to the filehandle before you go 
through the while loop the second time: seek $fh,0,0
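
A minimal sketch of that rewind, assuming a small temporary file stands in for the real data (the file contents and the `%lines_seen` counter are made up for the demonstration):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical two-line data file, created only for this demonstration.
my ($out, $filename) = tempfile(UNLINK => 1);
print {$out} "1\tABF09-AS5\n2\tABF09-AS9\n";
close($out) or die "Can't close $filename: $!";

open(my $fh, '<', $filename) or die "Can't open $filename: $!";

my %lines_seen;
for my $family ('AS5', 'AS9') {
    # Rewind to the start of the file. Without this, the second pass
    # starts at end-of-file and the while loop body never runs.
    seek($fh, 0, 0) or die "Can't rewind $filename: $!";
    $lines_seen{$family} = 0;
    while (my $line = <$fh>) {
        $lines_seen{$family}++;
    }
}
close($fh) or die "Can't close $filename: $!";

print "$_: $lines_seen{$_} lines read\n" for sort keys %lines_seen;
```

Both passes see every line because the read position is reset before each `while` loop.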

On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
>
> Hey Guys,
>
> I am still at the same place. I am writing these little pieces of code to
> try to learn the language better, so any advice would be useful. I am again
> parsing through tab-delimited files and now trying to find fish from one id
> (in this case families AS5 and AS9), retrieve the weights and average
> them. When I started I did it for one family and it worked (instead of
> @families I had a scalar $family set to AS5). But really it is more useful
> to look at more than one family at a time (I should mention that there are
> 2 types of fish per family: one ends in PS, the other doesn't). So I tried
> to use a foreach loop to go through the file twice, once with the search
> value set to AS5 and a second time with AS9. It works for AS5, but for some
> reason the foreach loop sets $test to AS9 the second time but doesn't go
> through the while loop. What am I doing wrong?
>
> here is the code:
>
> #! /usr/bin/perl
> use strict;
> use warnings;
>
> my $file = $ARGV[0];
> my @family = ('AS5','AS9');
> my $i;
> my $ii;
> my $test;
>
> open (my $fh, "<", $file) or die ("Can't open $file: $!");
>
> foreach (@family){
>     $test = $_;
>     my @data_weight_2N = ();
>     my @data_weight_3N = ();
>     while (<$fh>){
>         chomp;  
>         my $line = $_;
>         my @data  = split ("\t", $line);
>         if ($data[0] !~ /[0-9]*/){
>         next;}
>         elsif ($data[1] eq "ABF09-$test"){
>             $i += 1; 
>             push (@data_weight_2N,  $data[6]);
>         }elsif ($data[1] eq "ABF09-".$test."PS"){
>         $ii += 1;
>             push (@data_weight_3N,$data[6]);
>     }
> }
>     my $mean_2N = &average (\@data_weight_2N);
>     my $stdev_2N = &stdev (\@data_weight_2N);
>     my $stderr_2N = ($stdev_2N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test 
> 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
>
>     my $mean_3N = &average (\@data_weight_3N);
>     my $stdev_3N = &stdev (\@data_weight_3N);
>     my $stderr_3N = ($stdev_3N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test 
> 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
> }
>
> close ($fh);
>
> sub average{
>         my($data) = @_;
>         if (not @$data) {
>                 print ("Empty array\n");
>                 return 0;
>         }
>         my $total = 0;
>         foreach (@$data) {
>                 $total += $_;
>         }
>         my $average = $total / @$data;
>         return $average;
> }
>
> sub stdev{
>         my($data) = @_;
>         if(@$data == 1){
>                 return 0;
>         }
>         my $average = &average($data);
>         my $sqtotal = 0;
>         foreach(@$data) {
>                 $sqtotal += ($average-$_) ** 2;
>         }
>         my $std = ($sqtotal / (@$data-1)) ** 0.5;
>         return $std;
> }
>
> Thanks,
>
> T.
>
> -- 
> "Education is not to be used to promote obscurantism." - Theodonius 
> Dobzhansky.
>
> "Gracias a la vida que me ha dado tanto
> Me ha dado el sonido y el abecedario
> Con él, las palabras que pienso y declaro
> Madre, amigo, hermano
> Y luz alumbrando la ruta del alma del que estoy amando
>
> Gracias a la vida que me ha dado tanto
> Me ha dado la marcha de mis pies cansados
> Con ellos anduve ciudades y charcos
> Playas y desiertos, montañas y llanos
> Y la casa tuya, tu calle y tu patio"
>
> Violeta Parra - Gracias a la Vida
>
> Tiago S. F. Hori. PhD.
> Ocean Science Center-Memorial University of Newfoundland 
>


From sebastien.moretti at unil.ch  Thu Feb 14 08:09:06 2013
From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=)
Date: Thu, 14 Feb 2013 09:09:06 +0100
Subject: [Bioperl-l] PhyloXML
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>
References: <51152591.9010402@unil.ch>
	<F041F111-CF8F-4096-9968-5F8CA5DCA866@drycafe.net>
	<511898E6.7060400@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu>
	<511BBD83.2000708@unil.ch>
	<118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu>
Message-ID: <511C9BA2.9000508@unil.ch>

>>>>>> # Add annotation
>>>>>> $treeio->add_phyloXML_annotation(-obj => $tree,
>>>>>>                                 -xml => '<name>SUMF family</name>',
>>>>>>                                );
>>>>>
>>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>>>
>>>>> 	-hilmar
>>>>
>>>> I replaced $treeio by $tree in the above line but still get an error.
>>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script"
>>>>
>>>> The only thing I changed is the length of the xml string I try to insert. But get the same error with an empty xml string.
>>>>
>>>>
>>>>
>>>> my $treeio = new Bio::TreeIO(-file   => "$infile",
>>>>                              -format => 'phyloxml',
>>>>                             );
>>>> my $tree = $treeio->next_tree;
>>>>
>>>> # Add annotation
>>>> $tree->add_phyloXML_annotation(-obj => $tree,
>>>>                                -xml => '<name>SUMF family</name>',
>>>>                               );
>>>>
>>>> Can't locate object method "add_phyloXML_annotation" via package
>>>> 	"Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, <GEN0> line 1 (#1)
>>>>     (F) You called a method correctly, and it correctly indicated a package
>>>>     functioning as a class, but that package doesn't define that particular
>>>>     method, nor does any of its base classes.  See perlobj.
>>>>
>>>> Uncaught exception from user code:
>>>> 	
>>>> at ./add_annotation_to_phyloxml.pl line 40
>>>
>>> Will have to look into this.  One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started.
>>>
>>> chris
>>
>> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ?
>> The problem I have is "expected" ?
>>
>> --
>> Sébastien Moretti
>
> I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky.  I tried cleaning this up a few years back but didn't make much progress.
>
> The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it):
>
>      $treeio->add_phyloXML_annotation(-obj => $tree,
>                                -xml => '<name>SUMF family</name>',
>                               );
>
> My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back.  Can you file a bug on this?
>
> https://redmine.open-bio.org/
>
> chris

I will file a bug on this.

I'd be happy to try to contribute to the phyloxml code.
But don't know how to proceed for BioPerl.

-- 
Sébastien Moretti


From hartzell at alerce.com  Thu Feb 14 20:04:44 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 14 Feb 2013 12:04:44 -0800
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
References: <CADdQm2mHL-_X+bPh=cVwp1_xMCrVGhe0=D75Uf410X_L=qHz3g@mail.gmail.com>
	<3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
Message-ID: <20765.17244.185833.755900@gargle.gargle.HOWL>


I think that it's important to get feedback on code that one has
written and to try to understand how and why someone else has done
things in their code.  To that end....

Since Tiago's using this to learn the language better I can't resist
some comments beyond resetting the file handle.

For grins I rewrote it using Text::CSV_XS and Statistics::Basic and to
take a single pass through the data file using a multilevel data
structure.
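
For readers who don't follow the links, here is a rough sketch of the single-pass idea using only core Perl. It is not the script from the gist. The row layout (label in column 1, weight in column 6), the sample values, and the `%weights`/`%mean` names are assumptions based on the code quoted further down:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical rows: header line first, then id, label, four filler
# columns, and the weight in column 6 (0-based).
my @rows = (
    "id\tfamily\ta\tb\tc\td\tweight",
    "1\tABF09-AS5\t-\t-\t-\t-\t10",
    "2\tABF09-AS5\t-\t-\t-\t-\t20",
    "3\tABF09-AS5PS\t-\t-\t-\t-\t30",
    "4\tABF09-AS9\t-\t-\t-\t-\t40",
    "5\tABF09-AS9PS\t-\t-\t-\t-\t50",
    "6\tABF09-AS9PS\t-\t-\t-\t-\t70",
);

# One pass: file every weight under $weights{family}{ploidy}.
my %weights;
for my $row (@rows) {
    my @data = split /\t/, $row;
    next unless defined $data[1] && $data[1] =~ /^ABF09-(\w+?)(PS)?$/;
    my ($family, $ploidy) = ($1, defined $2 ? '3N' : '2N');
    push @{ $weights{$family}{$ploidy} }, $data[6];
}

# Summarise each group after the pass is complete.
my %mean;
for my $family (sort keys %weights) {
    for my $ploidy (sort keys %{ $weights{$family} }) {
        my @w = @{ $weights{$family}{$ploidy} };
        $mean{$family}{$ploidy} = sum(@w) / @w;
        printf "%s %s: n=%d mean=%.2f\n",
            $family, $ploidy, scalar @w, $mean{$family}{$ploidy};
    }
}
```

Because every family is collected in the same pass, there is no need to rewind the file handle or to list the families in advance.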

I resisted the urge to rewrite it in Moose.  Didn't even have an urge
to rewrite it in R.  Funny, that....

The script is here

  Tiago.pl
    https://gist.github.com/hartzell/4955401

With something like what I think the data looks like here:

    https://gist.github.com/hartzell/4955570

Even without that big of a rewrite, I had a bunch of local comments
which are inline below.

Daisie Huang writes:
 > [...]
 > On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
 > >
 > > Hey Guys,
 > >
 > > I am still at the same place. I am writing these little pieces of code to 
 > > try to learn the language better, so any advice would be useful.
 > > [...]
 > > here is the code:
 > >
 > > #! /usr/bin/perl
 > > use strict;
 > > use warnings;
 > >
 > > my $file = $ARGV[0];

Slightly better would be $filename, so that when you step up to
Path::Class you can differentiate a file object from a file name
string.

 > > my @family = ('AS5','AS9');

Better would be @families, plural.  See the use of $family below.

 > > my $i;
 > > my $ii;

As far as I can tell, these are just counting the number of things
that you push onto the various arrays.  You don't need them; referring
to the array in scalar context will give you its size.
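
A small sketch of what that looks like, with made-up weights and a made-up standard deviation (both are placeholders, not values from the real data):

```perl
use strict;
use warnings;

my @data_weight_2N = (12.5, 14.0, 13.5, 15.0);   # hypothetical weights
my $stdev_2N       = 1.08;                       # hypothetical value

# Assigning an array to a scalar evaluates it in scalar context,
# which yields the number of elements.
my $count = @data_weight_2N;

# sqrt() is a named unary operator, so it also imposes scalar context;
# the explicit "scalar" only documents the intent.
my $stderr_explicit = $stdev_2N / sqrt(scalar @data_weight_2N);
my $stderr_implicit = $stdev_2N / sqrt(@data_weight_2N);

print "count: $count\n";
print "stderr (explicit scalar): $stderr_explicit\n";
print "stderr (implicit):        $stderr_implicit\n";
```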

 > > my $test;

You use this to hold the name of the family, so it's not particularly
evocative.  You should also restrict its scope to within the loop.
See the comment for the foreach loop.

 > > open (my $fh, "<", $file) or die ("Can't open $file: $!");

You made my day, three arg. open *and* you checked for errors.  Nice!

 > > foreach (@family){

Better would be

  for my $family (@families) {

which is evocative and restricts the scope of $family to the for loop
(and for is 4 characters shorter than foreach...).

 > >     $test = $_;

No longer need this, using $family declared in the for loop with the
proper scoping.

 > >     my @data_weight_2N = ();
 > >     my @data_weight_3N = ();
 > >     while (<$fh>){
 > >         chomp;  
 > >         my $line = $_;
 > >         my @data  = split ("\t", $line);

Don't parse CSV (TSV) files yourself.  Get in the habit of using
Text::CSV_XS.
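
A minimal sketch of reading tab-separated rows with Text::CSV_XS (a CPAN module, so it has to be installed first). The in-memory sample data and the column positions are assumptions chosen to match the code quoted here:

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical sample data; a real script would open the file instead.
my $tsv = join '', map { "$_\n" } (
    "1\tABF09-AS5\t-\t-\t-\t-\t10",
    "2\tABF09-AS5PS\t-\t-\t-\t-\t30",
    "3\tABF09-AS5\t-\t-\t-\t-\t20",
);
open(my $fh, '<', \$tsv) or die "Can't open in-memory data: $!";

# sep_char switches the parser from commas to tabs; auto_diag reports
# parse errors instead of failing silently.
my $csv = Text::CSV_XS->new({ sep_char => "\t", binary => 1, auto_diag => 1 });

my @weights_2N;
while (my $row = $csv->getline($fh)) {
    # $row is an array reference holding one field per column.
    push @weights_2N, $row->[6] if $row->[1] eq 'ABF09-AS5';
}
close($fh) or die "Can't close in-memory data: $!";

print "2N weights: @weights_2N\n";
```

The module takes care of quoting, embedded separators and line endings, which a bare `split` does not.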

 > >         if ($data[0] !~ /[0-9]*/){
 > >         next;}
 > >         elsif ($data[1] eq "ABF09-$test"){
 > >             $i += 1; 

You don't need the counter.

 > >             push (@data_weight_2N,  $data[6]);
 > >         }elsif ($data[1] eq "ABF09-".$test."PS"){
 > >         $ii += 1;

You don't need the counter.

 > >             push (@data_weight_3N,$data[6]);
 > >     }
 > > }
 > >     my $mean_2N = &average (\@data_weight_2N);
 > >     my $stdev_2N = &stdev (\@data_weight_2N);

You don't need the ampersands on the subroutine calls.  They're old
school <joke> and just encourage people to make fun of our language for its
use of all those funny punctuation marks </joke>.

 > >     my $stderr_2N = ($stdev_2N/sqrt($i));

Unless I'm mistaken, this is equivalent

    my $stderr_2N = ($stdev_2N/sqrt(scalar @data_weight_2N));

and you don't need the counter.  The explicit use of scalar there might
even be redundant (I'm a coward).  You use the same trick in your
subroutine definitions below.

 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test 
 > > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
 > >
 > >     my $mean_3N = &average (\@data_weight_3N);
 > >     my $stdev_3N = &stdev (\@data_weight_3N);
 > >     my $stderr_3N = ($stdev_3N/sqrt($i));
 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test 
 > > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
 > > }
 > >
 > > close ($fh);

Ah, rats.  You checked whether open worked, you need to do the same
thing on close too!

  close ($fh) or die $!;

Or you could just

  use autodie qw(open close);

and then they'll die appropriately when they have to and you don't
have to bother with the checking.
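
A short sketch of autodie in action. The path is deliberately one that should not exist (an assumption about the machine running it), and the `eval` is there only so the demonstration can report the failure instead of exiting:

```perl
use strict;
use warnings;
use autodie qw(open close);

my $missing = '/nonexistent/dir/no_such_file.tsv';   # hypothetical path

my $ok = eval {
    # No "or die" needed: with autodie, a failed open throws an exception.
    open(my $fh, '<', $missing);
    close($fh);
    1;
};
my $error = $@;

if ($ok) {
    print "open unexpectedly succeeded\n";
}
else {
    print "open failed as expected: $error\n";
}
```

In a normal script you would leave out the `eval` and let the exception end the program with its message.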

 > > sub average{
 > >         my($data) = @_;
 > >         if (not @$data) {
 > >                 print ("Empty array\n");
 > >                 return 0;
 > >         }
 > >         my $total = 0;
 > >         foreach (@$data) {
 > >                 $total += $_;
 > >         }

  use List::AllUtils qw(sum); # somewhere up at the top of the script...

  my $total = sum(@$data);
  if (not defined $total) {
     print "Empty array\n";
     return;
  }

List::AllUtils is your friend.  Learn to use it.

Returning 0 for an empty list is probably the wrong thing; isn't
it possible for the total to actually be 0?  Just return instead.
Don't return undef, just return (and let perl take context into
account for you).

You probably don't actually want to spew "Empty array" out into your
output stream, imagine writing a script that postprocesses your output
and having to deal with it.  If you really need to say it, send it to
standard error with

  print STDERR "Empty array\n";

 > >         my $average = $total / @$data;
 > >         return $average;

If you don't really need the error message, then you can get to

  my $total = sum(@$data);
  return unless defined $total;
  return $total / @$data;

And if an empty data array is *truly* unexpected, maybe you should
just die/carp.
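
Putting those pieces together, a sketch of the whole subroutine might look like this. It uses `sum` from the core List::Util module rather than List::AllUtils (which offers a function of the same name), and it checks `defined` so that a list of zeros is not mistaken for an empty list:

```perl
use strict;
use warnings;
use List::Util qw(sum);

sub average {
    my ($data) = @_;
    my $total = sum(@$data);
    # sum() returns undef for an empty list; a bare return then gives
    # undef in scalar context and an empty list in list context.
    return unless defined $total;
    return $total / @$data;
}

my $mean_values = average([2, 4]);
my $mean_zeros  = average([0, 0, 0]);
my $mean_empty  = average([]);

print "mean of 2 and 4: $mean_values\n";
print "mean of zeros:   $mean_zeros\n";
print "mean of nothing: ", (defined $mean_empty ? $mean_empty : 'undefined'), "\n";
```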

 > > }
 > >
 > > sub stdev{
 > >         my($data) = @_;
 > >         if(@$data == 1){
 > >                 return 0;
 > >         }
 > >         my $average = &average($data);
 > >         my $sqtotal = 0;
 > >         foreach(@$data) {
 > >                 $sqtotal += ($average-$_) ** 2;
 > >         }
 > >         my $std = ($sqtotal / (@$data-1)) ** 0.5;
 > >         return $std;
 > > }

Ditto on the use of List::AllUtils, etc...

Phew.

The only other thing I'd like to see would be an arrangement that
lets you write simple tests.  A simple solution would be to package the
entire main part of the code up into e.g. a subroutine that returns a
hashref keyed by family, containing a hashref keyed by 2N/3N/... and
then you could just:

  use Test::More;
  
  use Tiago qw(summarize);
  
  my $output = summarize("test_data.tsv");
  
  is($output->{AS5}->{'2N'}, "42", "Got the magic number");
  
  # etc...
  
  done_testing;
  
Thanks for sharing your code.  Keep practicing!

g.


From carandraug+dev at gmail.com  Thu Feb 14 22:13:45 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Thu, 14 Feb 2013 22:13:45 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
Message-ID: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>

Hi

we got word of it on another project I'm involved with and I was
wondering. Is bioperl going to apply for the Google Summer of Code
this year?

http://www.google-melange.com/gsoc/homepage/google/gsoc2013

Carnë


From hlapp at drycafe.net  Fri Feb 15 14:28:30 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 15 Feb 2013 09:28:30 -0500
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
Message-ID: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>

I presume the OBF does, as an umbrella organization on behalf of all Bio* projects. If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or to look for co-mentors.

-hilmar

Sent with a tap.

On Feb 14, 2013, at 5:13 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> Hi
> 
> we got word of it on another project I'm involved with and I was
> wondering. Is bioperl going to apply for the Google Summer of Code
> this year?
> 
> http://www.google-melange.com/gsoc/homepage/google/gsoc2013
> 
> Carnë
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From p.j.a.cock at googlemail.com  Fri Feb 15 14:47:39 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 15 Feb 2013 14:47:39 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
Message-ID: <CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>

On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:
> I presume the OBF does as an umbrella organization on behalf of all Bio*
> projects. If you fancy proposing a project idea or mentoring, now is not a
> bad time to think about that or looking for co-mentors.
>
> -hilmar

Yes, the plan is that as in the last few years, the OBF will apply to
GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
this stage the Bio* projects would be wise to start coming up with
some good project ideas and experienced developers thinking about
being a mentor. For potential students, getting involved in the
community early is a good idea (e.g. filing bug reports or, better,
fixing existing bugs).

See also:
http://lists.open-bio.org/mailman/listinfo/gsoc
http://lists.open-bio.org/mailman/listinfo/gsoc-mentors

Peter


From cjfields at illinois.edu  Fri Feb 15 14:59:43 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 15 Feb 2013 14:59:43 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
	<CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>

On Feb 15, 2013, at 8:47 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp <hlapp at drycafe.net> wrote:
>> I presume the OBF does as an umbrella organization on behalf of all Bio*
>> projects. If you fancy proposing a project idea or mentoring, now is not a
>> bad time to think about that or looking for co-mentors.
>> 
>> -hilmar
> 
> Yes, the plan is that as in the last few years, the OBF will apply to
> GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
> this stage the Bio* projects would be wise to start coming up with
> some good project ideas and experienced developers thinking about
> being a mentor. For potential students, getting involved in the
> community early is a good idea (e.g. bug reports, or better fixing
> existing bugs)
> 
> See also:
> http://lists.open-bio.org/mailman/listinfo/gsoc
> http://lists.open-bio.org/mailman/listinfo/gsoc-mentors
> 
> Peter

At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else.  I can't take charge of writing up a proposal at the moment but I can certainly help edit.

chris


From scott at scottcain.net  Fri Feb 15 19:18:37 2013
From: scott at scottcain.net (Scott Cain)
Date: Fri, 15 Feb 2013 14:18:37 -0500
Subject: [Bioperl-l] sequence-region directives in gff files
In-Reply-To: <CAPOrs_3r_cay3d59uBXCNqKwGHRBOBy+c+XOzvrfMeHdbzNTLg@mail.gmail.com>
References: <CAPOrs_3r_cay3d59uBXCNqKwGHRBOBy+c+XOzvrfMeHdbzNTLg@mail.gmail.com>
Message-ID: <CA+JTaox4SeQueWRpvgmq7GpdJ=EzQe6t3Lim2yn6y=_dBcp95A@mail.gmail.com>

Hi Carnë,

Thanks for pointing this out; I was only sort of paying attention to
the FeatureIO discussion, and it hadn't occurred to me that my commit
was the problem.

I believe I've reproduced the functionality from that commit, and I
even added a test that makes use of the added method (yes, I know, it
surprised me too!).  All of the tests now pass for me in the FeatureIO
master.  I'm putting it on my todo list to check that the Chado loader
that makes use of Bio::FeatureIO still works as expected with the new
incarnation.

Thanks,
Scott


On Wed, Feb 13, 2013 at 5:22 AM, Carnë Draug <carandraug+dev at gmail.com> wrote:
> Hi Scott
>
> 3 years ago, the code for the Bio::SeqFeatureIO::* modules was split
> from bioperl-live into a separate repository[1]. Because the code was
> not removed from the bioperl-live repository, people ended up patching
> on both sides, leading to 2 branches of development. Last weekend I
> merged them back together with the exception of one commit that would
> no longer apply[2].
>
> This commit was authored by you with the following commit message:
> "tiny change to Bio::FeatureIO::gff to allow the gmod chado gff3 bulk
> loader to not choke when the gff file has ##sequence-region
> directives.  The loader is documented not to support this, but now it
> will quitely ignore those directives."
>
> Do you think you could take a look at it?
>
> Thank you,
> Carnë
>
> [1] https://github.com/bioperl/Bio-FeatureIO
> [2] https://github.com/bioperl/bioperl-live/commit/7218728b66ad297953676236077fd0ec757378c0


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From carandraug+dev at gmail.com  Tue Feb 19 18:52:57 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Tue, 19 Feb 2013 18:52:57 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <CAPOrs_0u2Qpft6_pWMaj3Wdf_-ZPOfnoYoOaevdCL443hnUsoA@mail.gmail.com>
References: <CAPOrs_2GA-h1hM73+jZ13Mjh3w3ZDh7jupQ4jHYcG=560jTQPg@mail.gmail.com>
	<50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
	<CAKVJ-_5M9r9ZA7=KLFhzejcJ36dL11f_2kCrJBp1vR5+S9BF3Q@mail.gmail.com>
	<118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>
	<CAPOrs_0u2Qpft6_pWMaj3Wdf_-ZPOfnoYoOaevdCL443hnUsoA@mail.gmail.com>
Message-ID: <CAPOrs_0kiyqSfvS7ZgEkWwbAaiA2L5fV9U2r5U9cROTvyMGLRw@mail.gmail.com>

On 15 February 2013 14:28, Hilmar Lapp <hlapp at drycafe.net> wrote:
> [...]
> If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors.

On 15 February 2013 14:59, Fields, Christopher J <cjfields at illinois.edu> wrote:
> At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else.  I can't take charge of writing up a proposal at the moment but I can certainly help edit.

I would like to participate this year as a student.

I do not, however, have any bioperl itch that would take a summer
to fix. The largest of them is to implement BLAST using NCBI's server.
They have made a SOAP-based BLAST available, and doing this has been on
my todo list for ages. Would you suggest any other project for bioperl?

Carnë


From peymanalavi at yahoo.com  Tue Feb 19 21:16:49 2013
From: peymanalavi at yahoo.com (peyman alavi)
Date: Tue, 19 Feb 2013 13:16:49 -0800 (PST)
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
Message-ID: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>

Hello,

I have been having problems for a while trying to install the Bio::SCF
module on my 32-bit Vista machine. Now, I know that Bio::SCF isn't really
a Bioperl module, but I need it for Bio::Graphics, and I thought perhaps
other people had experienced the same problem before. I have installed
zlib and io_lib (the latest available versions of both), but it looks
like something (presumably in io_lib) is missing. I should be very
grateful if someone could tell me what still needs to be done!

Here are the paths where the io_lib "library" and "include" directories
are installed; I set them in cpan before trying to install Bio::SCF:

o conf makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"

And the following is what I get on STDOUT:

Set up gcc environment - 4.7.2

cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.

    makepl_arg         [LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include]
Please use 'o conf commit' to make the config permanent!

Reading 'D:\Perl\cpan\Metadata'
  Database was generated on Sun, 17 Feb 2013 12:17:02 GMT
Running install for module 'Bio::SCF'
Running make for L/LD/LDS/Bio-SCF-1.03.tar.gz
Checksum for D:\Perl\cpan\sources\authors\id\L\LD\LDS\Bio-SCF-1.03.tar.gz ok
Scanning cache D:\Perl/cpan/build for sizes
[...]DONE
Bio-SCF-1.03/
Bio-SCF-1.03/t/
Bio-SCF-1.03/t/scf.t
Bio-SCF-1.03/eg/
Bio-SCF-1.03/eg/write_test_obj.pl
Bio-SCF-1.03/eg/write_test_tied.pl
Bio-SCF-1.03/eg/read_test_obj.pl
Bio-SCF-1.03/eg/read_test_tied.pl
Bio-SCF-1.03/SCF/
Bio-SCF-1.03/SCF/Arrays.pm
Bio-SCF-1.03/DISCLAIMER
Bio-SCF-1.03/README
Bio-SCF-1.03/SCF.pm
Bio-SCF-1.03/SCF.xs
Bio-SCF-1.03/Changes
Bio-SCF-1.03/test.scf
Bio-SCF-1.03/Makefile.PL
Bio-SCF-1.03/META.yml
Bio-SCF-1.03/INSTALL
Bio-SCF-1.03/MANIFEST

  CPAN.pm: Building L/LD/LDS/Bio-SCF-1.03.tar.gz

Set up gcc environment - 4.7.2
Checking if your kit is complete...
Looks good
Writing Makefile for Bio::SCF
Writing MYMETA.yml and MYMETA.json
cp SCF.pm blib\lib\Bio\SCF.pm
cp SCF/Arrays.pm blib\lib\Bio\SCF\Arrays.pm
D:\Perl\bin\perl.exe D:\Perl\site\lib\ExtUtils\xsubpp  -typemap D:\Perl\lib\ExtUtils\typemap  SCF.xs > SCF.xsc &&
D:\Perl\bin\perl.exe -MExtUtils::Command -e mv -- SCF.xsc SCF.c
Please specify prototyping behavior for SCF.xs (see perlxs manual)
c:/MinGW/bin/gcc.exe -c  -Ic:/MinGW/msys/1.0/local/include  -DNDEBUG
-DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE
-DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T
-DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields -O2  -DVERSION=\"1.03\"  -DXS_VERSION=\"1.03\"  "-ID:\Perl\lib\CORE"  -DLITTLE_ENDIAN SCF.c
In file included from c:/MinGW/msys/1.0/local/include/io_lib/scf.h:31:0,
                 from SCF.xs:12:
c:/MinGW/msys/1.0/local/include/io_lib/mFILE.h:23:0: warning:
"MF_APPEND" redefined [enabled by default]
In file included from c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/windows.h:55:0,
                 from D:\Perl\lib\CORE/win32.h:61,
                 from D:\Perl\lib\CORE/win32thread.h:4,
                 from D:\Perl\lib\CORE/perl.h:2825,
                 from SCF.xs:5:
c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/winuser.h:131:0:
note: this is the location of the previous definition
SCF.xs: In function 'XS_Bio__SCF_get_scf_pointer':
SCF.xs:35:2: warning: passing argument 3 of '(*Perl_ILIO_ptr((struct
PerlInterpreter *)Perl_get_context()))->pNameStat' from incompatible pointer
type [enabled by default]
SCF.xs:35:2: note: expected 'struct _stati64 *' but argument is of type
'struct stat *'
Running Mkbootstrap for Bio::SCF ()
D:\Perl\bin\perl.exe -MExtUtils::Command -e chmod -- 644 SCF.bs
D:\Perl\bin\perl.exe -MExtUtils::Mksymlists \
     -e "Mksymlists('NAME'=>\"Bio::SCF\", 'DLBASE' => 'SCF',
'DL_FUNCS' => {  }, 'FUNCLIST' => [], 'IMPORTS' => {  }, 'DL_VARS' => []);"
Set up gcc environment - 4.7.2
dlltool --def SCF.def --output-exp dll.exp
c:\MinGW\bin\g++.exe -o blib\arch\auto\Bio\SCF\SCF.dll -Wl,--base-file
-Wl,dll.base -mdll -L"D:\Perl\lib\CORE" SCF.o   D:\Perl\lib\CORE\perl512.lib
c:\MinGW\lib\libkernel32.a c:\MinGW\lib\libuser32.a c:\MinGW\lib\libgdi32.a
c:\MinGW\lib\libwinspool.a c:\MinGW\lib\libcomdlg32.a c:\MinGW\lib\libadvapi32.a
c:\MinGW\lib\libshell32.a c:\MinGW\lib\libole32.a c:\MinGW\lib\liboleaut32.a
c:\MinGW\lib\libnetapi32.a c:\MinGW\lib\libuuid.a c:\MinGW\lib\libws2_32.a
c:\MinGW\lib\libmpr.a c:\MinGW\lib\libwinmm.a c:\MinGW\lib\libversion.a
c:\MinGW\lib\libodbc32.a c:\MinGW\lib\libodbccp32.a c:\MinGW\lib\libcomctl32.a
c:\MinGW\lib\libmsvcrt.a dll.exp
Warning: resolving _VirtualQuery at 12 by linking to _VirtualQuery
Use --enable-stdcall-fixup to disable these warnings
Use --disable-stdcall-fixup to disable these fixups
Warning: resolving _VirtualProtect at 16 by linking to _VirtualProtect
Warning: resolving _EnterCriticalSection at 4 by linking to
_EnterCriticalSection
Warning: resolving _TlsGetValue at 4 by linking to _TlsGetValue
Warning: resolving _GetLastError at 0 by linking to _GetLastError
Warning: resolving _LeaveCriticalSection at 4 by linking to
_LeaveCriticalSection
Warning: resolving _DeleteCriticalSection at 4 by linking to
_DeleteCriticalSection
Warning: resolving _InitializeCriticalSection at 4 by linking to
_InitializeCriticalSection
SCF.o:SCF.c:(.text+0xf35): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0xf4b): undefined reference to `mfwrite_scf'
SCF.o:SCF.c:(.text+0xf6a): undefined reference to `mfflush'
SCF.o:SCF.c:(.text+0xf72): undefined reference to `mfdestroy'
SCF.o:SCF.c:(.text+0x1138): undefined reference to `write_scf'
SCF.o:SCF.c:(.text+0x16ac): undefined reference to `scf_deallocate'
SCF.o:SCF.c:(.text+0x17b1): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0x17c1): undefined reference to `mfread_scf'
SCF.o:SCF.c:(.text+0x19bd): undefined reference to `read_scf'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe:
SCF.o: bad reloc address 0xa4 in section `.rdata'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe:
final link failed: Invalid operation
collect2.exe: error: ld returned 1 exit status
dmake.exe:  Error code 129, while making 'blib\arch\auto\Bio\SCF\SCF.dll'
  LDS/Bio-SCF-1.03.tar.gz
  D:\Perl\site\bin\dmake.exe -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 LDS/Bio-SCF-1.03.tar.gz                      : make NO
Warning: Configuration not saved.
Lockfile removed.

Thanks in advance for any useful suggestions/help!!
Peyman


From scott at scottcain.net  Tue Feb 19 23:39:44 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 19 Feb 2013 18:39:44 -0500
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan
	fails
In-Reply-To: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
References: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
Message-ID: <777246AB-2EF0-403D-9652-8EA8390D5C53@scottcain.net>

Hi Peyman,

I have no idea what might be required to get staden and Bio::SCF installed on a Windows machine; you have my sympathies for having to go through it.

But what I wanted to touch on was what you wrote, that is, that you "need" it for Bio::Graphics. I just wanted to point out that you don't need it unless you want to be able to display traces from ABI sequencers (which most people don't really care to do these days). Bio::SCF is listed as a recommended module, not a required one.
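
If it helps, a quick way to see which of the two modules Perl can actually load is something like the following. This is just a sketch using core Perl; the helper name module_available is made up, and it only tells you whether a module loads, not whether Bio::Graphics will use it.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded, false otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::SCF -> Bio/SCF.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(Bio::Graphics Bio::SCF)) {
    printf "%-14s %s\n", $module,
        module_available($module) ? 'installed' : 'not installed';
}
```

If Bio::Graphics shows up as installed and Bio::SCF does not, that should be fine for everything except trace display.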

Scott




From anngregory at email.arizona.edu  Wed Feb 20 05:20:41 2013
From: anngregory at email.arizona.edu (Ann Gregory)
Date: Tue, 19 Feb 2013 22:20:41 -0700
Subject: [Bioperl-l]  Problem Parsing BLAST output to annotate FASTA file
Message-ID: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>

Hi BioPerl,

I am having issues with a BioPerl script. I have a blastxml file from a
blastx search and the original multifasta file containing the original
nucleotide sequences.

I want to take the blast result (i.e. the blast description) and annotate my
multifasta file.

I have written 2 while loops that extract the blast descriptions as well as
the nucleotide sequence from the multifasta file.

My problem is that I cannot incorporate one of the while loops into the
other without losing the loop property of one of the loops. I would like
to take the 1st blast description, then the 1st nucleotide sequence, then
the 2nd blast description, then the 2nd nucleotide sequence and so
on... I just can't figure out how to alternate the results.

See script below:


use warnings;
use strict;
use Bio::SearchIO;
use Bio::SeqIO;


my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            my $qd = $hit->description;
            print $qd, "\n";
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

--
Ann (Nina) Gregory
Graduate Student
Rich Lab / Sullivan Lab
Soil, Water, Environmental Science Department
University of Arizona


From yonexhalaolv at gmail.com  Wed Feb 20 09:17:12 2013
From: yonexhalaolv at gmail.com (Sebastian Lau)
Date: Wed, 20 Feb 2013 01:17:12 -0800 (PST)
Subject: [Bioperl-l] failed to install via fink: no package found for specification 'bioperl-pm5100'!
Message-ID: <84fa1bcb-a39f-4847-bff2-e3a9c2b909ea@googlegroups.com>

Hi guys,

I was just about to install bioperl on my Mac OS X 10.7.5 via fink, but after
typing the command, fink said it couldn't find any package:

fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm5100
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm5100'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm588
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm588'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm586
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm586'!

I followed the instructions on the wiki. I don't know what's wrong.
Thanks for your help.


From awitney at sgul.ac.uk  Wed Feb 20 15:22:51 2013
From: awitney at sgul.ac.uk (Adam Witney)
Date: Wed, 20 Feb 2013 15:22:51 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124EA4B.5020409@sgul.ac.uk>


Hi Ann,

On 20/02/2013 05:20, Ann Gregory wrote:
> Hi BioPerl,
> 
> I am having issues with a BioPerl script. I have a blastxml file from a
> blastx search and the original multifasta file containing the original
> nucleotide sequences.
> 
> I want to take the blast result (i.e. the blast description) and annotate my
> multifasta file.
> 
> I have written 2 while loops that extract the blast descriptions as well as
> the nucleotide sequence from the multifasta file.
> 
> My problem is that I cannot incorporate one of the while loops into the
> other without losing the loop property of one of the loops. I would like
> to take the 1st blast description, then the 1st nucleotide sequence, then
> the 2nd blast description, then the 2nd nucleotide sequence and so
> on... I just can't figure out how to alternate the results.
> 
> See script below:
> 
> 
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> 
> 
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $qd = $hit->description;
> print $qd, "\n";
> }
> }
> }
> 
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> print $nuc, "\n";
> }--

I think what you are proposing assumes that the loop over the BLAST
results will come back in the same order as the loop over the FASTA
file. This may be the case, but I'm not sure it's something I would rely on.

Anyway, I would loop over the BLAST results, storing the relevant data
to an array or hash and then loop over the fasta file to put the two
together. eg:

my $blast_data;

while ( ... blast data ... ) {
	...
	$blast_data->{$qd} = <whatever you want to store>
	...
}

while ( my $seqobj = $seqio->next_seq ) {
	my $id = $seqobj->id;
	print $blast_data->{$id}."\n";
}

Something along those lines... or have I misunderstood you? If so, can
you provide some more details, such as what you want your output to look
like?
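
To make the store-then-look-up idea concrete, here is a small runnable sketch in plain Perl. It uses made-up in-memory data rather than real files: the arrays @blast_rows and @fasta and the function name annotate are hypothetical stand-ins. With BioPerl the pairs would presumably come from $result->query_name / $hit->description and $seqobj->display_id / $seqobj->seq. Note that the lookup can only work if the hash is keyed by the query's ID, i.e. the same string that the FASTA record's ID returns.

```perl
use strict;
use warnings;

# Hypothetical stand-in for parsed BLAST output: [query name, hit description].
my @blast_rows = (
    [ 'contig2', 'DNA polymerase III subunit alpha' ],
    [ 'contig1', 'hypothetical protein' ],
    [ 'contig1', 'second, weaker hit' ],
);

# Hypothetical stand-in for the multi-FASTA file: [id, sequence].
my @fasta = (
    [ 'contig1', 'ATGGCC' ],
    [ 'contig2', 'ATGAAA' ],
    [ 'contig3', 'ATGTTT' ],
);

sub annotate {
    my ($rows, $seqs) = @_;

    # Pass 1: remember the first description seen for each query.
    my %blast_data;
    for my $row (@$rows) {
        my ($query, $desc) = @$row;
        $blast_data{$query} //= $desc;
    }

    # Pass 2: walk the sequences in file order and attach the description.
    my @out;
    for my $rec (@$seqs) {
        my ($id, $seq) = @$rec;
        next unless exists $blast_data{$id};    # no hit for this sequence
        push @out, ">$id $blast_data{$id}\n$seq\n";
    }
    return @out;
}

print annotate(\@blast_rows, \@fasta);
```

Because the lookup goes through the hash, the order of the BLAST results no longer matters, and sequences without any hit (contig3 here) are simply skipped.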

HTH

Adam


From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 16:24:50 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:24:50 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124F8D2.4020904@uni-wuerzburg.de>

Oops, I just realized I had one loop too many in there. Adam is correct.
Sorry.

The last part of the code I sent you should look like this:

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
print ">$hits{$seqobj->display_id}\n";
my $nuc = $seqobj->seq();
print $nuc, "\n";
}


Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de



From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 16:14:29 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:14:29 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
Message-ID: <5124F665.5050602@uni-wuerzburg.de>

Hi Ann,

I agree with Adam, but I was already writing my email, while his came 
in. Hope it helps:

I hope I understand correctly what you want to do.
Just to clarify, you queried a protein blast database with blastx and 
nucleotide queries. Now you want to associate the protein description 
for the FIRST blast hit with the corresponding nucleotide fasta file. Is 
that correct?
You have to put the two while loops into one another. Or associate the 
blast hits with the query descriptions. But it's not feasible to take 
the first blast hit and the first nucleotide fasta seq, then the 2nd of 
both etc, as Adam already pointed out.
You would have to iterate through both at the same time. I.e. take the 
first blast hit, then iterate through the nucleotide fasta until you 
find the hit. Then take the 2nd blast hit and iterate through the 
nucleotide fasta etc. It's probably easiest to do this in a hash.

Something along the lines of (not tested I just punched that in the E-Mail):

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
"$ARGV[0]");
while (my $result = $search_in->next_result) {
while (my $hit = $result->next_hit) {
while (my $hsp = $hit->next_hsp) {
if ($hit->description eq $hit_desc) { # Only want the first blast hit
next;
}
my $hit_desc = $hit->description;
$hits{$result->query_description} = $hit_desc;
}
}
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
foreach my $query (keys %hits) {
while (my $seqobj = $seqio->next_seq) {
if ($seqobj->display_id eq $query) {
print ">$hits{$query}\n";
my $nuc = $seqobj->seq();
print $nuc, "\n";
}

You might want to put some evalue cutoff in there to only score 
significant hits. Also if your nucleotide query multi-fasta file is very 
large, you might consider creating an index first:
http://www.bioperl.org/wiki/HOWTO:Local_Databases#Bio::Index
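
As a rough illustration of the e-value cutoff idea, here is a sketch in plain Perl on made-up data. The rows, the function name best_hits and the cutoff of 1e-5 are all assumptions for the example; with BioPerl the e-value would normally be read from the hit object (I believe via $hit->significance), so treat this as the logic only.

```perl
use strict;
use warnings;

# Hypothetical stand-in rows: [query name, hit description, e-value].
my @rows = (
    [ 'contig1', 'hypothetical protein',   1e-50 ],
    [ 'contig1', 'transposase',            2e-3  ],
    [ 'contig2', 'weak, unconvincing hit', 0.5   ],
    [ 'contig3', 'ABC transporter',        3e-12 ],
);

# Keep, per query, the hit with the lowest e-value at or below the cutoff.
sub best_hits {
    my ($cutoff, @hits) = @_;
    my %best;
    for my $row (@hits) {
        my ($query, $desc, $evalue) = @$row;
        next if $evalue > $cutoff;    # not significant enough
        if (!exists $best{$query} || $evalue < $best{$query}{evalue}) {
            $best{$query} = { desc => $desc, evalue => $evalue };
        }
    }
    return \%best;
}

my $best = best_hits(1e-5, @rows);
for my $query (sort keys %$best) {
    print "$query\t$best->{$query}{desc}\t$best->{$query}{evalue}\n";
}
```

Queries whose hits all fall above the cutoff (contig2 here) never make it into the hash, so they drop out of the annotation step automatically.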

Hope that helps!

Cheers,
Andreas

P.S.: Please next time include version numbers for BioPerl and Perl and 
a little more detail what you want to do. ;-)


--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de



From andreas.leimbach at uni-wuerzburg.de  Wed Feb 20 17:00:51 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 18:00:51 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
	<5124F8D2.4020904@uni-wuerzburg.de>
	<CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
Message-ID: <51250143.9050503@uni-wuerzburg.de>

Hey Ann,

Damn, it's not my best day... Anyway, I wouldn't work with
List::MoreUtils's each_array function, as this assumes that the blast
hits and the nucleotide queries are in the same order (as Adam pointed
out). Rather, use a hash, which associates a key with a certain value. Also,
the hash can be used to skip sequences that have no hits.
Here's my new version:

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            # hash: associate query_desc (key) with hit_desc (value)
            $hits{$result->query_description} = $hit->description;
            last; # jump out of the while loop; this should resolve getting only the first hit
        }
        last; # see above
    }
}


my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    # only true if display_id is associated with a hit_desc; this should skip seqs without hits
    if ($hits{$seqobj->display_id}) {
        print ">$hits{$seqobj->display_id}\n";
        my $nuc = $seqobj->seq();
        print $nuc, "\n";
    }
}
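
The lookup idea can also be tried without BioPerl or any input files. 
What follows is only a minimal sketch on toy data: the sub name 
annotate_fasta, the IDs and the descriptions are made up for 
illustration, and the hash simply plays the role of %hits.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): join hit descriptions to
# sequences by ID, skipping sequences that have no hit.
sub annotate_fasta {
    my ($hits, $records) = @_;    # hashref id => description, arrayref of [id, seq]
    my $out = '';
    for my $rec (@$records) {
        my ($id, $seq) = @$rec;
        next unless exists $hits->{$id};    # no hit recorded for this ID
        $out .= ">$hits->{$id}\n$seq\n";
    }
    return $out;
}

# Toy data standing in for the parsed BLAST report and the FASTA file.
my %hits = (
    contig_1 => 'hypothetical protein A',
    contig_3 => 'hypothetical protein B',
);
my @records = (
    [ contig_1 => 'ATGCATGC' ],
    [ contig_2 => 'GGGGCCCC' ],    # no hit, so it is skipped
    [ contig_3 => 'TTTTAAAA' ],
);

my $fasta = annotate_fasta(\%hits, \@records);
print $fasta;
```

The sketch tests with exists rather than for a true value, so a hit 
with an empty description would still count as a hit. The order of the 
output follows the FASTA records, not the BLAST report.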

Cheers,
Andreas

P.S.: I redirected your mail to the BioPerl mailing list, others might 
profit from my mistakes ;-) ...

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 17:35, Ann Gregory wrote:
> Hi Andreas,
>
> Thanks for your help! I don't understand how this gets the first blast hit:
>
> if ($hit->description eq $hit_desc) { # Only want the first blast hit
> next;
> }
>
> I tried this and seems to be working...but I can't get the 1st blast hit
> or skip the sequences that had no hits. Do you know any quick fixes?
>
> *
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> use List::MoreUtils qw(each_array);
>
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> my @ids;
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $match = $result->num_hits;
> push(@ids, $qd);
> }
> }
> }
> }
>
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> my @seqs;
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> push(@seqs, $nuc);
> }
>
> my $it = each_array(@ids, @seqs);
> while(my($ids,$seqs)=$it->()){
> print $ids, "\n", $seqs, "\n";
> }
> *
>
> Thanks again!
> ~Ann
>
> On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach
> <andreas.leimbach at uni-wuerzburg.de
> <mailto:andreas.leimbach at uni-wuerzburg.de>> wrote:
>
>     oops, I just realized I had one loop too many in there. Adam is
>     correct. Sorry.
>
>     The last part of the code I sent you should look like this:
>
>
>     my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>     while (my $seqobj = $seqio->next_seq) {
>     print ">$hits{$seqobj->display_id}\n";
>
>     my $nuc = $seqobj->seq();
>     print $nuc, "\n";
>     }
>
>
>     Cheers,
>     Andreas
>
>
>     --
>     Andreas Leimbach
>     Universität Münster
>     Institut für Hygiene
>     Mendelstr. 7
>     D-48149 Münster
>     Germany
>
>     Tel.: +49 (0)551 39 3843
>     E-Mail: andreas.leimbach at uni-wuerzburg.de
>
>     On 20.2.13 06:20, Ann Gregory wrote:
>
>         Hi BioPerl,
>
>         I am having issues with a BioPerl script. I have a blastxml file
>         from a
>         blastx blast and the original multifasta file containing the
>         original
>         nucleotides sequences.
>
>         I want to take the blast result (ie. the blast description) and
>         annotate my
>         multifasta file.
>
>         I have written 2 while loops that extract the blast descriptions
>         as well as
>         the nucleotide sequence from the multifasta file.
>
>         My problem is that I cannot incorporate one of the while loops
>         into the
>         other without losing the loop property of one of the loops. I
>         would like
>         to take the 1st blast description, then the 1st nucleotide
>         sequence, then
>         the 2nd blast description, then the 2nd nucleotide sequence and so
>         on...just can't figure out how to alternate the results.
>
>         See script below:
>
>
>         use warnings;
>         use strict;
>         use Bio::SearchIO;
>         use Bio::SeqIO;
>
>
>         my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
>         "$ARGV[0]");
>         while (my $result = $search_in->next_result) {
>         while (my $hit = $result->next_hit) {
>         while (my $hsp = $hit->next_hsp) {
>         my $qd = $hit->description;
>         print $qd, "\n";
>         }
>         }
>         }
>
>         my $seqio = Bio::SeqIO->new(-format => 'fasta', -file =>
>         "$ARGV[1]");
>         while (my $seqobj = $seqio->next_seq) {
>         my $nuc = $seqobj->seq();
>         print $nuc, "\n";
>         }
>         --
>         Ann (Nina) Gregory
>         Graduate Student
>         Rich Lab / Sullivan Lab
>         Soil, Water, Environmental Science Department
>         University of Arizona
>         _________________________________________________
>         Bioperl-l mailing list
>         Bioperl-l at lists.open-bio.org <mailto:Bioperl-l at lists.open-bio.org>
>         http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
>
> --
> Ann (Nina) Gregory
> Graduate Student
> Rich Lab / Sullivan Lab
> Soil, Water, Environmental Science Department
> University of Arizona
>
>
>


From cjfields at illinois.edu  Wed Feb 20 18:24:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 20 Feb 2013 18:24:58 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <51250143.9050503@uni-wuerzburg.de>
References: <CAHxs2gtL=UVAh_f7nSCFKAOj11wf92MThNqHCDxAEfRyb+M_zw@mail.gmail.com>
	<5124F8D2.4020904@uni-wuerzburg.de>
	<CAHxs2gtYf70wvFtEX2nFZEtTsUcuw0i1nHzKBRL=H4tcVo+vBQ@mail.gmail.com>
	<51250143.9050503@uni-wuerzburg.de>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2EB4A@CHIMBX5.ad.uillinois.edu>

If this is meant to be something done using the same FASTA files for a bunch of BLAST reports, might be worth setting up a flat file index and using that to look up and grab the sequences; it should be a LOT faster, just the first pass (generation of the initial index) would take a little time.  Look at Bio::DB::Fasta for an example.
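
To make the indexing idea concrete, here is a rough sketch in core Perl of what a flat file index does: one pass records the byte offset of every header, and later lookups seek straight to the record. Bio::DB::Fasta takes care of all this for you; the helper names build_index and fetch_seq and the toy file below are made up for illustration and are not its API.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: record the byte offset of each '>' header line.
sub build_index {
    my ($path) = @_;
    my %offset;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (1) {
        my $pos  = tell $fh;          # position before reading the line
        my $line = <$fh>;
        last unless defined $line;
        $offset{$1} = $pos if $line =~ /^>(\S+)/;
    }
    close $fh;
    return \%offset;
}

# Hypothetical helper: seek to one record and read only that record.
sub fetch_seq {
    my ($path, $index, $id) = @_;
    return unless exists $index->{$id};
    open my $fh, '<', $path or die "Cannot open $path: $!";
    seek $fh, $index->{$id}, SEEK_SET;
    my $header = <$fh>;               # skip the header line
    my $seq = '';
    while (my $line = <$fh>) {
        last if $line =~ /^>/;        # the next record starts here
        chomp $line;
        $seq .= $line;
    }
    close $fh;
    return $seq;
}

# Toy FASTA file so the sketch runs on its own.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} ">seq1 first\nATGC\nATGC\n>seq2 second\nGGCC\n";
close $tmp;

my $index = build_index($path);
my $seq1  = fetch_seq($path, $index, 'seq1');
my $seq2  = fetch_seq($path, $index, 'seq2');
print "$seq1\n$seq2\n";
```

The first pass costs one full read of the file; after that each lookup touches only the record it needs, which is where the speed-up over re-reading the FASTA file for every report comes from.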

chris

On Feb 20, 2013, at 11:00 AM, Andreas Leimbach <andreas.leimbach at uni-wuerzburg.de>
 wrote:

> Hey Ann,
> 
> damn, it's not my best day ... Anyway, I wouldn't work with List::MoreUtils's each_array function, as this assumes that the blast hits and the nucleotide queries are in the same order (as Adam pointed out). Rather, use a hash, which associates each key with a value. Also, the hash can be used to skip sequences that have no hits.
> Here's my new version:
> 
> my %hits;
> my $hit_desc;
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> $hits{$result->query_description} = $hit->description; # hash: associate query_desc (key) with hit_desc (value)
> last; # jump out of the while loop; this should resolve getting only the first hit
> }
> last; # see above
> }
> }
> 
> 
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> while (my $seqobj = $seqio->next_seq) {
> if ($hits{$seqobj->display_id}) { # only true if display_id associated with hit_desc and should skip seqs without hits
> print ">$hits{$seqobj->display_id}\n";
> my $nuc = $seqobj->seq();
> print $nuc, "\n";
> }
> }
> 
> Cheers,
> Andreas
> 
> P.S.: I redirected your mail to the BioPerl mailing list, others might profit from my mistakes ;-) ...
> 
> --
> Andreas Leimbach
> Universität Münster
> Institut für Hygiene
> Mendelstr. 7
> D-48149 Münster
> Germany
> 
> Tel.: +49 (0)551 39 3843
> E-Mail: andreas.leimbach at uni-wuerzburg.de
> 
> On 20.2.13 17:35, Ann Gregory wrote:
>> Hi Andreas,
>> 
>> Thanks for your help! I don't understand how this gets the first blast hit:
>> 
>> if ($hit->description eq $hit_desc) { # Only want the first blast hit
>> next;
>> }
>> 
>> I tried this and seems to be working...but I can't get the 1st blast hit
>> or skip the sequences that had no hits. Do you know any quick fixes?
>> 
>> *
>> use warnings;
>> use strict;
>> use Bio::SearchIO;
>> use Bio::SeqIO;
>> use List::MoreUtils qw(each_array);
>> 
>> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
>> "$ARGV[0]");
>> my @ids;
>> while (my $result = $search_in->next_result) {
>> while (my $hit = $result->next_hit) {
>> while (my $hsp = $hit->next_hsp) {
>> my $match = $result->num_hits;
>> push(@ids, $qd);
>> }
>> }
>> }
>> }
>> 
>> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>> my @seqs;
>> while (my $seqobj = $seqio->next_seq) {
>> my $nuc = $seqobj->seq();
>> push(@seqs, $nuc);
>> }
>> 
>> my $it = each_array(@ids, @seqs);
>> while(my($ids,$seqs)=$it->()){
>> print $ids, "\n", $seqs, "\n";
>> }
>> *
>> 
>> Thanks again!
>> ~Ann
>> 
>> On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach
>> <andreas.leimbach at uni-wuerzburg.de
>> <mailto:andreas.leimbach at uni-wuerzburg.de>> wrote:
>> 
>>    oops, I just realized I had one loop too many in there. Adam is
>>    correct. Sorry.
>> 
>>    The last part of the code I sent you should look like this:
>> 
>> 
>>    my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>>    while (my $seqobj = $seqio->next_seq) {
>>    print ">$hits{$seqobj->display_id}\n";
>> 
>>    my $nuc = $seqobj->seq();
>>    print $nuc, "\n";
>>    }
>> 
>> 
>>    Cheers,
>>    Andreas
>> 
>> 
>>    --
>>    Andreas Leimbach
>>    Universität Münster
>>    Institut für Hygiene
>>    Mendelstr. 7
>>    D-48149 Münster
>>    Germany
>> 
>>    Tel.: +49 (0)551 39 3843
>>    E-Mail: andreas.leimbach at uni-wuerzburg.de
>> 
>>    On 20.2.13 06:20, Ann Gregory wrote:
>> 
>>        Hi BioPerl,
>> 
>>        I am having issues with a BioPerl script. I have a blastxml file
>>        from a
>>        blastx blast and the original multifasta file containing the
>>        original
>>        nucleotides sequences.
>> 
>>        I want to take the blast result (ie. the blast description) and
>>        annotate my
>>        multifasta file.
>> 
>>        I have written 2 while loops that extract the blast descriptions
>>        as well as
>>        the nucleotide sequence from the multifasta file.
>> 
>>        My problem is that I cannot incorporate one of the while loops
>>        into the
>>        other without losing the loop property of one of the loops. I
>>        would like
>>        to take the 1st blast description, then the 1st nucleotide
>>        sequence, then
>>        the 2nd blast description, then the 2nd nucleotide sequence and so
>>        on...just can't figure out how to alternate the results.
>> 
>>        See script below:
>> 
>> 
>>        use warnings;
>>        use strict;
>>        use Bio::SearchIO;
>>        use Bio::SeqIO;
>> 
>> 
>>        my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
>>        "$ARGV[0]");
>>        while (my $result = $search_in->next_result) {
>>        while (my $hit = $result->next_hit) {
>>        while (my $hsp = $hit->next_hsp) {
>>        my $qd = $hit->description;
>>        print $qd, "\n";
>>        }
>>        }
>>        }
>> 
>>        my $seqio = Bio::SeqIO->new(-format => 'fasta', -file =>
>>        "$ARGV[1]");
>>        while (my $seqobj = $seqio->next_seq) {
>>        my $nuc = $seqobj->seq();
>>        print $nuc, "\n";
>>        }
>>        --
>>        Ann (Nina) Gregory
>>        Graduate Student
>>        Rich Lab / Sullivan Lab
>>        Soil, Water, Environmental Science Department
>>        University of Arizona
>>        _________________________________________________
>>        Bioperl-l mailing list
>>        Bioperl-l at lists.open-bio.org <mailto:Bioperl-l at lists.open-bio.org>
>>        http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>> 
>> 
>> 
>> --
>> Ann (Nina) Gregory
>> Graduate Student
>> Rich Lab / Sullivan Lab
>> Soil, Water, Environmental Science Department
>> University of Arizona
>> 
>> 
>> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From carandraug+dev at gmail.com  Mon Feb 25 10:08:23 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Mon, 25 Feb 2013 10:08:23 +0000
Subject: [Bioperl-l] module for description of sequence variants (where to
	place code)
Message-ID: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>

Hi

I'm writing a perl module to write a description of the variance
between 2 sequences as described on
http://www.hgvs.org/mutnomen/recs-prot.html

Basically, given 2 sequences, it would return something like "p.Lys2del
p.His25_Met26insGln" if those are the differences. It also accounts
for the existence of - characters in the sequences that may come from
their alignment.

My question is, where on the project tree should I place the module?

Also, is there something already written that would convert from 1 to
3 letter code?

Carnë


From andreas.leimbach at uni-wuerzburg.de  Mon Feb 25 10:32:43 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Mon, 25 Feb 2013 11:32:43 +0100
Subject: [Bioperl-l] module for description of sequence variants (where
 to place code)
In-Reply-To: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>
References: <CAPOrs_0X9tF0_4q-KmV_OMu5vPDT7JbRsPZteLf5dYh1n9_vPg@mail.gmail.com>
Message-ID: <512B3DCB.7050008@uni-wuerzburg.de>

Hi Carnë,

for your last question:
You can convert aa strings from one to three letter code with 
'Bio::SeqUtils'.

Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 25.2.13 11:08, Carnë Draug wrote:
> Hi
>
> I'm writing a perl module to write a description of the variance
> between 2 sequences as described on
> http://www.hgvs.org/mutnomen/recs-prot.html
>
> Basically, given 2 sequences, it would return something like "p.Lys2del
> p.His25_Met26insGln" if those are the differences. It also accounts
> for the existence of - characters in the sequences that may come from
> their alignment.
>
> My question is, where on the project tree should I place the module?
>
> Also, is there something already written that would convert from 1 to
> 3 letter code?
>
> Carnë
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From genehack at genehack.org  Thu Feb 28 00:57:48 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Wed, 27 Feb 2013 16:57:48 -0800
Subject: [Bioperl-l] YAPC talks?
Message-ID: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>

Hi -

Is there anyone that was planning on submitting a Bioperl talk to
YAPC::NA? In an unrelated conversation, one of the organizers
expressed an interest in getting a Bioperl talk this year.

If no one else is planning on a talk submission, Jay Hannah (aka
deafferret) and I are promising/threatening a tag-team style "Bioperl
rules / Bioperl sucks" overview/state of the dist style talk...

thanks,
john.


From cjfields at illinois.edu  Thu Feb 28 02:48:55 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 28 Feb 2013 02:48:55 +0000
Subject: [Bioperl-l] YAPC talks?
In-Reply-To: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
References: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6E705CD3@CHIMBX5.ad.uillinois.edu>

At the moment I personally have no plans on going, but I think a no-holds-barred bioperl talk is a good idea.  

chris

On Feb 27, 2013, at 6:57 PM, John SJ Anderson <genehack at genehack.org> wrote:

> Hi -
> 
> Is there anyone that was planning on submitting a Bioperl talk to
> YAPC::NA? In an unrelated conversation, one of the organizers
> expressed an interest in getting a Bioperl talk this year.
> 
> If no one else is planning on a talk submission, Jay Hannah (aka
> deafferret) and I are promising/threatening a tag-team style "Bioperl
> rules / Bioperl sucks" overview/state of the dist style talk...
> 
> thanks,
> john.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From hlapp at drycafe.net  Thu Feb 28 03:20:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 27 Feb 2013 22:20:34 -0500
Subject: [Bioperl-l] YAPC talks?
In-Reply-To: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
References: <CABJ3DF_o2n2nS5ywzweYaaA6AQzXuQ-KPQHp80QkVv+U09T0aw@mail.gmail.com>
Message-ID: <42C1F1B8-FE26-43A8-B601-E80D17D215EC@drycafe.net>


On Feb 27, 2013, at 7:57 PM, John SJ Anderson wrote:

> Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl
> rules / Bioperl sucks" overview/state of the dist style talk...

Please videotape. I'll be sure to watch and promote it :-)

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From saladi1 at illinois.edu  Thu Feb 28 06:58:20 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Wed, 27 Feb 2013 22:58:20 -0800
Subject: [Bioperl-l] EUtilities Cookbook - Accn to gi
Message-ID: <CAARX5cXXD_DNb+Sbt-_zXvsn63QAaVBcot9YGtEjQ7ucrqAEKQ@mail.gmail.com>

Hi,

I think that rettype for the section "Get GIs for a list of accessions"
should be

-rettype => 'gi');

instead of 'gilist' as it is now. I think this change is due to a change in
NCBI eutils.

webpage:
http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Get_GIs_for_a_list_of_accessions

Thanks,
Shyam


From fossandonc at hotmail.com  Thu Feb 28 15:36:34 2013
From: fossandonc at hotmail.com (Francisco J. Ossandón)
Date: Thu, 28 Feb 2013 12:36:34 -0300
Subject: [Bioperl-l] Fix for Bug #3376 broke somewhere else
Message-ID: <SNT133-ds14A180BAFAE068EE359031CFFE0@phx.gbl>

Hi,
I was re-checking Bug #3302 using the Bio::SearchIO modules of the
repository and found that now it can't parse a Hmmer2 file that was
previously fine. After tracking the problem, I discovered that a change in a
regular expression to fix another bug broke the parse.
 
The fix for Bug #3376 consisted of adding an extra condition to skip
lines where the end-of-domain indicator is split across lines
(https://redmine.open-bio.org/issues/3376):
TEST: domain 1 of 1, from 8 to 97: score 184.7, E = 2.5e-56
                   *->svfqqqqssksttgstvtAiAiAigYRYRYRAvtWnsGsLssGvnDn
                      sv+qqqq+  +    +vtAiAiAigYRYRYRAv Wn GsLs G nDn
        Test     8    SVYQQQQGGSA----MVTAIAIAIGYRYRYRAVVWNKGSLSTGTNDN 50   

                   DnDqqsdgLYtiYYsvtvpssslpsqtviHHHaHkasstkiiikiePr<-
                   DnDq +d LYtiYYsvtv +ss+p q+v+HHHaH+asstkiiiki P   
        Test    51 DNDQAAD-LYTIYYSVTVSASSWPGQSVTHHHAHPASSTKIIIKIAPS   97   

                   *

        Test     -   -
This case is characterized by the 2 dashes in the line...

So this expression was added to "next_result" in hmmer2.pm
(https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2):
                        elsif (CORE::length($_) == 0
                            || ( $count != 1 && /^\s+$/o )
                            || /^\s+\-?\*\s*$/
                            || /^.+\-\s+\-\s*$/ ) ### <--- This regex was designed for bug 3376
                        {
                            next;
                        }

But the expression used is too broad because it uses "^.+" just before
the 2 dashes, and it broke the parsing of lines like these, where the
alignment is full of dashes:
                   KyACrqCdtiVQAPaPakpIErGiptaGLLArvlVSKyaEHlPLYRQsEI
                                                                     
  lcl|gi|340     - -------------------------------------------------- -    

                   yaRqGVeiaRstLadWVgrtgarLaPLvdALaeyVLkeGklHADeTPVqV
                         +i  s L   V++ + r                           
  lcl|gi|340 60938 ------AIMISGLIHGVSARCLRF-------------------------- 60955

I think a reasonable fix that still resolves the original bug and restores
parsing for this case is to add an extra \s+ in the regex just before the
first dash, so the expression makes sure that the first dash is the one that
comes AFTER the description (and is replacing the usual coordinate number)
and is not the last of an alignment or a series of dashes like the one
above:
                        elsif (CORE::length($_) == 0
                            || ( $count != 1 && /^\s+$/o )
                            || /^\s+\-?\*\s*$/
                            || /^.+\s+\-\s+\-\s*$/ ) ### <--- Tweaked regex
                        {
                            next;
                        }
I tested it and it works fine, hope you find the fix acceptable.
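
For anyone who wants to check the two expressions in isolation, here is a
small standalone sketch. The two test strings are modeled on the report
lines quoted above (the exact number of spaces and dashes is approximate),
and the variable names are only for illustration.

```perl
use strict;
use warnings;

my $old = qr/^.+\-\s+\-\s*$/;       # regex added for bug 3376
my $new = qr/^.+\s+\-\s+\-\s*$/;    # tweaked regex with the extra \s+

# Split end-of-domain marker line from bug 3376: should be skipped.
my $split_marker = '        Test     -   -';
# All-gap alignment line: must NOT be skipped.
my $gap_line = '  lcl|gi|340     - ' . ('-' x 50) . ' -    ';

my %match = (
    old_marker => ($split_marker =~ $old ? 1 : 0),
    new_marker => ($split_marker =~ $new ? 1 : 0),
    old_gap    => ($gap_line     =~ $old ? 1 : 0),
    new_gap    => ($gap_line     =~ $new ? 1 : 0),
);

printf "%-12s %d\n", $_, $match{$_} for sort keys %match;
```

Both expressions match the split marker line, but only the old one also
matches the all-gap alignment line, which is why that line was being
dropped.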

Cheers,

--
Francisco J. Ossandon
Bioinformatician.
Ph.D. Candidate, University Andres Bello.
Center for Bioinformatics and Genome Biology,
Fundacion Ciencia para la Vida.
Santiago, Chile.
www.cienciavida.cl/CBGB.htm


From PDagosto at edgebio.com  Mon Feb 25 16:50:34 2013
From: PDagosto at edgebio.com (Phil Dagosto)
Date: Mon, 25 Feb 2013 16:50:34 +0000
Subject: [Bioperl-l] Error when running Build.PL
Message-ID: <DC8C6FE0AED292469CF192A00459937BC0F8660B@EDGE-EXCH02.edgebio.com>

Greetings,

I downloaded BioPerl 1.6.1 from this location: http://www.bioperl.org/wiki/Getting_BioPerl

When I ran Build.PL with all of the default settings chosen in the interactive mode I got the following error message:

Could not get valid metadata. Error is: Invalid metadata structure. Errors: 'Perl_5' for 'license' does not have a URL scheme (resources -> license) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::gff -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::WebAgent -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::EUtilParameters -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::OntologyIO::InterProParser -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Biblio::IO::medlinexml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::strider -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::RandomFactory -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA::ESEfinder -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::game::gameSubs -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::interpro -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::berkeleydb -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::entrezgene -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tinyseq -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::chadoxml -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::SeqIO::game::gameWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::FileCache -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::bsml_sax -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Primer3 -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::HtSNP -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tree::Compatible -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Taxonomy::entrez -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::agave -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::TagHaplotype -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::SeqFeature::Store::FeatureFileLoader -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::Protein* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::blastxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::EUtilities -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::Tree::Draw::Cladogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tigrxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Collection -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Draw::Pictogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::Writer::BSMLResultWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::HIVQuery -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::TreeIO::svggraph -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::eutils -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern::BackTranslate -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::GenBank -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Variation::IO::xml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::GraphViz -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Annotated -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::DB::NCBIHelper -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::HIV -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Run::RemoteBlast -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::excel -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::ClusterIO::dbsnp -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Microarray::Tools::ReseqChip -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::soap -> requires) [Validation: 1.2]
at /usr/local/lib/perl5/5.10.1/Module/Build/Base.pm line 4559

Could not create MYMETA files
Creating new 'Build' script for 'BioPerl' version '1.006001'

I have no idea whether this is a problem or whether I can proceed. Also, I'm confused by the version number referenced in the last line. 1.006001 is our current version - I thought I was installing version 1.6.1. Are these version numbers equivalent, i.e., are the zeros not meaningful?

I was actually looking for version 1.2.3 (or greater) - where can I find that?

Thanks,
Phil

Phil Dagosto
Sr. Software Engineer
Edge Bio
201 Perry Parkway, Suite 5
Gaithersburg, MD 20850

pdagosto at edgebio.com
(240) 912-8669