From jason.stajich at gmail.com Fri Feb 1 01:58:57 2013 From: jason.stajich at gmail.com (Jason Stajich) Date: Thu, 31 Jan 2013 22:58:57 -0800 Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13 In-Reply-To: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com> References: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com> Message-ID: Dan - I think the answer is yes if others are doing it - I am not in a position to be much of a main coder. I don't know which format you speak of here or if you had to write something for the text blast changes or something else. Specific bug reports on formats that aren't working is always helpful. The XML format has been pretty stable so I would suggest that if you are simply parsing reports not looking at them. Chris posted instructions on how to contribute and the move to github simplifies this. That you had to write a whole new parser seems probably a bit severe - I hope that in the future people can speak to the problems sooner. If I hit a wall with something I can't do I usually write the code to fix it and contribute it back but I don't play follow-the-format-changes with the tools anymore, but hopefully others like yourself can make the contributions. If you speak to the response I made to the question below, I don't think anyone will be trying and support the NCBI's additional markups that refer to the upstream and downstream features as they are laid out in the text files without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable. best wishes, Jason On Jan 30, 2013, at 1:40 PM, Dan kilburn wrote: > Hi Jason, > > Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help. 
> > --Dan > Sent from my iPhone > > On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote: > >> Send Bioperl-l mailing list submissions to >> bioperl-l at lists.open-bio.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> or, via email, send a message with subject or body 'help' to >> bioperl-l-request at lists.open-bio.org >> >> You can reach the person managing the list at >> bioperl-l-owner at lists.open-bio.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of Bioperl-l digest..." >> >> >> Today's Topics: >> >> 1. Re: Parsing Blast-Report extracting "Features flanking .." >> (Jason Stajich) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Tue, 29 Jan 2013 11:00:16 -0800 >> From: Jason Stajich >> Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features >> flanking .." >> To: buschj at hhu.de >> Cc: bioperl-l at lists.open-bio.org >> Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com> >> Content-Type: text/plain; charset=us-ascii >> >> We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that. >> >> However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features. 
>> >> basically: >> - download the genome and GFF for arabidopsis >> - align your sRNA to the genome with a short read aligner - bowtie, bwa, others >> - convert your sam to bam file with SAMtools or picard >> - compare the location of features with the reads to get expression summaries or individuals reads with BEDTools >> >> >> On Jan 25, 2013, at 2:20 AM, jobu wrote: >> >>> Am 22.01.2013 19:03, schrieb Mgavi Brathwaite: >>>> What upstream and downstream elements are you interested in? >>> >>> >>> I've got a huge pile of short RNA reads. >>> Part of the question now is whether those RNA fragments originate from >>> siRNA events, >>> or may represent miRNAs / parts of pre-miRNAs. >>> >>> So I did an online blast search against database nt. >>> The resulting report quite often just gives subject information like this: >>> >>> ----- >>>> gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence >>> Length=23459830 >>> ----- >>> >>> Now I would like to get the hit's neighbouring regions for further >>> analysis. >>> Preferably I would like to do that in an automized way, but the only >>> possible action with this kind of subject gi | description would be to >>> fetch the entire chromosomal sequence I guess ? >>> >>> However, >>> right below the line above, the report states more precisely: >>> >>> ------ >>> Features flanking this part of subject sequence: >>> 8872 bp at 5' side: cytochrome P450 90B1 >>> 402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K >>> ------ >>> >>> Still I would like to have the possibility to automatically fetch the >>> subject's sequence(s), >>> as of now I think parsing the report with SearchIO won't let me aquire >>> that information, because SearchIO does not recognize report sections >>> like those. >>> >>> I hope I did not miss any of SearchIOs capabilities, but I could not >>> find any method covering my wish?! 
>>> >>> Right now maybe the only way to get the information I want is to >>> construct my own parser and write it out into a separate file, which in >>> turn again I could read into a hash before processing the Blast-Report >>> with SearchIO to combine both data for further automized work. >>> >>> I am aware though that even successfully getting the flanking features >>> would leave me with the more or less wide intergenic gap my hsp is >>> located in. >>> >>> However I'm in need of a way to get the flanking features including >>> their annotation and the region spanning between them. >>> But I hope I do not have to get complete sequences to accomplish that, >>> as this would be kind of an overkill. >>> >>> with kind regards >>> Jochen >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Jason Stajich >> jason.stajich at gmail.com >> jason at bioperl.org >> >> >> >> >> ------------------------------ >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> End of Bioperl-l Digest, Vol 117, Issue 13 >> ****************************************** > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Jason Stajich jason.stajich at gmail.com jason at bioperl.org From dr_kilburn59 at yahoo.com Fri Feb 1 09:25:34 2013 From: dr_kilburn59 at yahoo.com (Dan Kilburn) Date: Fri, 1 Feb 2013 06:25:34 -0800 (PST) Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13 In-Reply-To: References: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com> Message-ID: <1359728734.27412.YahooMailNeo@web162006.mail.bf1.yahoo.com> Hi Jason, ? Thanks for?the detailed feedback.? 
The real reason I had to write my own parser is that even with close, repeated support from NCBI we couldn't get XML output with short_web_blast.pl, because the parameter that turns on XML output was not functioning (they've probably fixed it by now), and I had to crank out a parser asap to support a job talk.

I don't think the upstream and downstream feature reports are particularly useful, because in mammals they tend to be so far away that they are not likely to be biologically relevant. But the internal motif reports are useful, maybe especially if you are blasting short reads, like I was. A 16-mer preserved domain hit is really good if you're blasting 18-mer Illumina short reads, like I was.

As far as my involvement goes, I got diagnosed with cancer on Wednesday, so I'll be taking a step back until next week's surgery and taking a lot of deep breaths. On the other hand, this just makes me more motivated: I've been thinking a lot about time, and timely contributions, the last two days.

Cheers,
Dan

________________________________
From: Jason Stajich
To: Dan kilburn
Cc: "bioperl-l at lists.open-bio.org"
Sent: Friday, February 1, 2013 1:58 AM
Subject: Re: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13

Dan -

I think the answer is yes if others are doing it - I am not in a position to be much of a main coder. I don't know which format you speak of here or if you had to write something for the text blast changes or something else. Specific bug reports on formats that aren't working is always helpful. The XML format has been pretty stable so I would suggest that if you are simply parsing reports not looking at them. Chris posted instructions on how to contribute and the move to github simplifies this. That you had to write a whole new parser seems probably a bit severe - I hope that in the future people can speak to the problems sooner.
If I hit a wall with something I can't do I usually write the code to fix it and contribute it back but I don't play follow-the-format-changes with the tools anymore, but hopefully others like yourself can make the contributions. If you speak to the response I made to the question below, I don't think anyone will be trying and support the NCBI's additional markups that refer to the upstream and downstream features as they are laid out in the text files without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable. best wishes, Jason On Jan 30, 2013, at 1:40 PM, Dan kilburn wrote: Hi Jason, > >Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help. > >--Dan >Sent from my iPhone > >On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote: > > >Send Bioperl-l mailing list submissions to >>??bioperl-l at lists.open-bio.org >> >>To subscribe or unsubscribe via the World Wide Web, visit >>??http://lists.open-bio.org/mailman/listinfo/bioperl-l >>or, via email, send a message with subject or body 'help' to >>??bioperl-l-request at lists.open-bio.org >> >>You can reach the person managing the list at >>??bioperl-l-owner at lists.open-bio.org >> >>When replying, please edit your Subject line so it is more specific >>than "Re: Contents of Bioperl-l digest..." >> >> >>Today's Topics: >> >>?1. Re: ?Parsing Blast-Report extracting "Features flanking ???.." >>????(Jason Stajich) >> >> >>---------------------------------------------------------------------- >> >>Message: 1 >>Date: Tue, 29 Jan 2013 11:00:16 -0800 >>From: Jason Stajich >>Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features >>??flanking ???.." 
>>To: buschj at hhu.de >>Cc: bioperl-l at lists.open-bio.org >>Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com> >>Content-Type: text/plain; ???charset=us-ascii >> >>We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that. >> >>However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features. >> >>basically: >>- download the genome and GFF for arabidopsis >>- align your sRNA to the genome with a short read aligner - bowtie, bwa, others >>- convert your sam to bam file with SAMtools or picard >>- compare the location of features with the reads to get expression summaries or individuals reads with BEDTools >> >> >>On Jan 25, 2013, at 2:20 AM, jobu wrote: >> >> >>Am 22.01.2013 19:03, schrieb Mgavi Brathwaite: >>> >>>What upstream and downstream elements are you interested in? >>>> >>> >>>I've got a huge pile of short RNA reads. >>>Part of the question now is whether those RNA fragments originate from >>>siRNA events, >>>or may represent miRNAs / parts of pre-miRNAs. >>> >>>So I did an online ?blast search against database nt. >>>The resulting report quite often just gives subject information like this: >>> >>>----- >>> >>>gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence >>>>Length=23459830 >>>----- >>> >>>Now I would like to get the hit's neighbouring regions ?for further >>>analysis. >>>Preferably I would like to do that ?in an automized way, but the only >>>possible action with this kind of subject gi | description would be to >>>fetch the entire chromosomal ?sequence I guess ? 
>>> >>>However, >>>right below the line above, the report states more precisely: >>> >>>------ >>>Features flanking this part of subject sequence: >>>8872 bp at 5' side: cytochrome P450 90B1 >>>402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K >>>------ >>> >>>Still I would like to have the possibility to automatically fetch the >>>subject's sequence(s), >>>as of now I think ?parsing the report with SearchIO won't let me aquire >>>that information, because SearchIO does not recognize report sections >>>like those. >>> >>>I hope I did not miss any of SearchIOs capabilities, but I could not >>>find any method covering my wish?! >>> >>>Right now maybe the only way to get the information I want is to >>>construct my own parser and write it out into a separate file, which in >>>turn again ?I could read into a hash before processing the Blast-Report >>>with SearchIO to combine both data for further automized work. >>> >>>I am aware though that even successfully getting the flanking features >>>would leave me with the more or less wide ?intergenic gap my hsp is >>>located in. >>> >>>However I'm in need of a way to get the flanking features including >>>their annotation and the region spanning between them. >>>But I hope I do not have to get complete sequences to accomplish that, >>>as this would be kind of an overkill. 
>>> >>>with kind regards >>>Jochen >>> >>> >>> >>>_______________________________________________ >>>Bioperl-l mailing list >>>Bioperl-l at lists.open-bio.org >>>http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>Jason Stajich >>jason.stajich at gmail.com >>jason at bioperl.org >> >> >> >> >>------------------------------ >> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l at lists.open-bio.org >>http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >>End of Bioperl-l Digest, Vol 117, Issue 13 >>****************************************** >> >_______________________________________________ >Bioperl-l mailing list >Bioperl-l at lists.open-bio.org >http://lists.open-bio.org/mailman/listinfo/bioperl-l > Jason Stajich jason.stajich at gmail.com jason at bioperl.org From carandraug+dev at gmail.com Sat Feb 2 20:44:31 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Sun, 3 Feb 2013 01:44:31 +0000 Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option Message-ID: Hi the TCoffee module does not options of the named argument type: -arg => option one needs to do like 'arg' => option Is there a special reason for this? I tracked down this to the commit 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e 12 years ago[1]. A comment on the code actually says "don't want named parameters"[2] (though the commit message sounds pretty innocuous "migrated to new Bio::Root::RootI chained new"). Is there a reason for this? The rest of bioperl has no issue with named parameters, and the API should be the same as Clustalw which also has no problem with it. This is very easy to fix, I can submit a pull request no problem. Also, shouldn't the code complain in the case of non-supported options? Took me a very long time to find out the problem because there was no complaints coming from the code. There is also a problem with the way it handles the output option. 
I'll have to look closer into it, but the documentation is simply incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta' (undocumented), works fine. Carn? [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374 From cjfields at illinois.edu Sun Feb 3 16:54:51 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 3 Feb 2013 21:54:51 +0000 Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu> Carn?, On Feb 2, 2013, at 7:44 PM, Carn? Draug wrote: > Hi > > the TCoffee module does not options of the named argument type: > > -arg => option > > one needs to do like > > 'arg' => option > > Is there a special reason for this? I tracked down this to the commit > > 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e > > 12 years ago[1]. A comment on the code actually says "don't want named > parameters"[2] (though the commit message sounds pretty innocuous > "migrated to new Bio::Root::RootI chained new"). Is there a reason for > this? The rest of bioperl has no issue with named parameters, and the > API should be the same as Clustalw which also has no problem with it. > This is very easy to fix, I can submit a pull request no problem. IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones. This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency. The downside of big changes like this: potential backwards compatibility issues. 
Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change. I don't have a problem breaking this with a bioperl 2.0 release, though. > Also, shouldn't the code complain in the case of non-supported > options? Took me a very long time to find out the problem because > there was no complaints coming from the code. Yes, it should complain when options are given that do not make sense, some validation would help there. With some modules this might be a side-effect of using AUTOLOAD or simply not checking the parameters. > There is also a problem with the way it handles the output option. > I'll have to look closer into it, but the documentation is simply > incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta' > (undocumented), works fine. That's entirely possible. > Carn? > [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e > [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374 As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it. Infernal was this way IIRC. Maybe these should just be simply stored as a semi-validated set of key-value pairs. 
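A minimal sketch of the "semi-validated set of key-value pairs" idea, assuming a wrapper that accepts both the '-arg' and 'arg' spellings and refuses options it does not know. The function name normalize_params and the whitelist are hypothetical and only for illustration; this is not the actual Bio::Tools::Run::Alignment::TCoffee code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical whitelist for illustration; not the real TCoffee parameter list.
my %ALLOWED = map { $_ => 1 } qw(output matrix ktuple);

# Accept both '-arg' and 'arg' spellings: strip one leading dash,
# lowercase the key, and die on anything that is not whitelisted.
sub normalize_params {
    my @args = @_;
    die "Odd number of arguments; expected key => value pairs\n" if @args % 2;
    my %params;
    while (@args) {
        my ($key, $value) = splice(@args, 0, 2);
        (my $name = lc $key) =~ s/^-//;
        die "Unsupported option '$key'\n" unless $ALLOWED{$name};
        $params{$name} = $value;
    }
    return %params;
}

# Both spellings end up under the same key.
my %p = normalize_params(-output => 'fasta', 'ktuple' => 2);
print join(', ', map { "$_=$p{$_}" } sort keys %p), "\n";

# An unknown option is reported instead of being silently ignored.
my $ok = eval { normalize_params(-bogus => 1); 1 };
print "error: $@" unless $ok;
```

Storing the options in a plain hash sidesteps the getter/setter naming problem mentioned above, since hash keys can contain hyphens or start with a digit.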
chris From carandraug+dev at gmail.com Sun Feb 3 23:34:22 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Mon, 4 Feb 2013 04:34:22 +0000 Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu> Message-ID: On 3 February 2013 21:54, Fields, Christopher J wrote: > On Feb 2, 2013, at 7:44 PM, Carn? Draug wrote: > >> Hi >> >> the TCoffee module does not options of the named argument type: >> >> -arg => option >> >> one needs to do like >> >> 'arg' => option >> >> Is there a special reason for this? I tracked down this to the commit >> >> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e >> >> 12 years ago[1]. A comment on the code actually says "don't want named >> parameters"[2] (though the commit message sounds pretty innocuous >> "migrated to new Bio::Root::RootI chained new"). Is there a reason for >> this? The rest of bioperl has no issue with named parameters, and the >> API should be the same as Clustalw which also has no problem with it. >> This is very easy to fix, I can submit a pull request no problem. > > IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones. This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency. > > The downside of big changes like this: potential backwards compatibility issues. Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change. I don't have a problem breaking this with a bioperl 2.0 release, though. Should passing the tests be enough? There's one for TCofee. 
At the moment I don't see how this would cause compatibility issues: we are adding an option, not removing it. But the comment on the code, stating plainly that the -param API was not wanted, caught me by surprise, which is why I'm asking.

> As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it. Infernal was this way IIRC. Maybe these should just be simply stored as a semi-validated set of key-value pairs.

From a quick glance at the list of TCoffee parameters I don't at the moment see any that should cause a problem.

I have submitted a bug report[1] which mentions some other issues I found with TCoffee. If someone could comment on them, that would be great, and I can start fixing them.

Carnë

[1] https://redmine.open-bio.org/issues/3406

From whereverroadgoes at gmail.com Mon Feb 4 10:39:19 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 07:39:19 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
Message-ID: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>

The result I get is:

Number of bases of type A =
Number of bases of type C =
Number of bases of type G =
Number of bases of type T =

i.e. the expected values are missing. Please help!

#! /usr/bin/perl
use Bio::Tools::SeqStats;
use Bio::Seq;
open (FILE, "seq.fasta");
@array = <FILE>;
# Removing first line of fasta
shift (@array);
$array = join('',@array);
open (FILE2, ">>seq2.fasta");
print FILE2 "$array";
$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
- alphabet => 'dna',);
my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
my $monomer_ref = $seq_stats->count_monomers();
foreach $base (sort keys %$monomer_ref) {
print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";
}

From hamish.mcwilliam at bioinfo-user.org.uk Mon Feb 4 11:59:16 2013
From: hamish.mcwilliam at bioinfo-user.org.uk (Hamish McWilliam)
Date: Mon, 4 Feb 2013 16:59:16 +0000
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
In-Reply-To:
References: <200305311150.h4VBopn2019091@localhost.localdomain>
Message-ID:

BLASTCLUST is part of the legacy NCBI BLAST package (not NCBI BLAST+) and can be obtained from:
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST

As Robert notes, there are many other tools which can be used to perform sequence clustering. Wikipedia has a Sequence Clustering article (http://en.wikipedia.org/wiki/Sequence_clustering) which lists some of the most commonly used.

All the best,

Hamish

On 1 February 2013 04:15, Rob wrote:
> Cyril C.C. Chua bmb.leeds.ac.uk> writes:
>
>>
>> Hi,
>>
>> I have some difficulty in sourcing for BLASTCLUST or related
>> programs/mods. Does any1 know exactly how to locate them?
>> >> Regards >> >> Cyril Chua >> > > > Hi Cyril, > > I heard of the following programmes that might do similar things (I HAVEN'T > used any of them yet): > > Afree - http://www.vicbioinformatics.com/software.afree.shtml > Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf > Usearch - http://www.drive5.com/usearch/ > DomClust - http://mbgd.genome.ad.jp/domclust/ > > or > > Check this: > > http://ppod.princeton.edu/help/help_tech.html > > God bless, > > > Robert > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ---- "Saying the internet has changed dramatically over the last five years is clich? ? the internet is always changing dramatically" - Craig Labovitz, Arbor Networks. From whereverroadgoes at gmail.com Mon Feb 4 12:34:10 2013 From: whereverroadgoes at gmail.com (Slym) Date: Mon, 4 Feb 2013 09:34:10 -0800 (PST) Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: Thanks Roy, It still doesn't seem to produce anything. :/ From roy.chaudhuri at gmail.com Mon Feb 4 12:51:03 2013 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Mon, 4 Feb 2013 17:51:03 +0000 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: Sorry, I'd missed another problem in your code - you are trying to load a fasta file using Bio::PrimarySeq. To read sequence data from a file you should use Bio::SeqIO, see: http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_file http://www.bioperl.org/wiki/HOWTO:SeqIO Cheers, Roy. 
From asjo at koldfront.dk Mon Feb 4 12:58:25 2013
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Mon, 04 Feb 2013 18:58:25 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> (Slym's message of "Mon, 4 Feb 2013 07:39:19 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID: <8738xc2c72.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 07:39:19 -0800 (PST), Slym wrote:

> #! /usr/bin/perl
> use Bio::Tools::SeqStats;
> use Bio::Seq;

It can be a good idea to add "use strict; use warnings;" to the top of your script. At least two problems in your program would have been caught by perl if you had.

> open (FILE, "seq.fasta");

Using (global) literal filehandles and the two parameter open() is somewhat outdated, a more current way to do it could be:

  open my $fh, '<', 'seq.fasta';

> @array = <FILE>;
> # Removing first line of fasta
> shift (@array);
> $array = join('',@array);
> open (FILE2, ">>seq2.fasta");
> print FILE2 "$array";

Note that you are writing just the sequence to your seq2.fasta file here, so the new file isn't really a fasta file.

> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
> - alphabet => 'dna',);

Bio::PrimarySeq doesn't take a '-file' parameter. Also, note that the filename is different than before: "sekw2" vs. "seq2"!

Either you should use Bio::SeqIO with a '-file' parameter, or you can use Bio::PrimarySeq with a '-seq' parameter.

> my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
> my $monomer_ref = $seq_stats->count_monomers();
> foreach $base (sort keys %$monomer_ref) {
> print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";

Here you wanted $monomer_ref->{$base}, as %monomer_ref isn't mentioned anywhere else.

> }

Here is a complete version of your script - I chose to use Bio::SeqIO - that works:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use Bio::SeqIO;
  use Bio::Tools::SeqStats;

  my $io=Bio::SeqIO->new(-file=>'seq.fasta', -alphabet=>'dna');
  my $seqobj=$io->next_seq; # Get the first sequence from the file
  my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
  my $monomer_ref = $seq_stats->count_monomers();
  foreach my $base (sort keys %$monomer_ref) {
      print "Liczba zasad typu ", $base," = ", $monomer_ref->{$base},"\n";
  }

E.g.:

  $ cat seq.fasta
  >test
  aaaacccggt
  $ ./slym.pl
  Liczba zasad typu A = 4
  Liczba zasad typu C = 3
  Liczba zasad typu G = 2
  Liczba zasad typu T = 1
  $

Best regards,

Adam

--
"Grittings. Ma nam is Kahlfin."
Adam Sjøgren
asjo at koldfront.dk

From whereverroadgoes at gmail.com Mon Feb 4 13:02:29 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 10:02:29 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To:
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID:

The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an error (saying that it wants input provided by Bio::PrimarySeq).

(btw in this line
$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 'dna',);
there's a typo "sekw2" instead of "seq2" but this is correct in my original code).
From cjfields at illinois.edu Mon Feb 4 13:54:39 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 4 Feb 2013 18:54:39 +0000 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE161ED@CHIMBX5.ad.uillinois.edu> Please make sure and read both Roy's and Adam's responses all the way through; Bio::SeqIO is not a sequence object but the front-end for format parsing (e.g. FASTA, etc). Bio::PrimarySeq does not have a '-file' parameter, Bio::SeqIO does. If SeqStats truly doesn't work with Bio::Seq we can fix that, but according to Adam he has tested using Bio::SeqIO out and it seems to work. chris On Feb 4, 2013, at 12:02 PM, Slym wrote: > The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an > error (saying that it wants input provided by Bio::PrimarySeq). > (btw in this line > $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => > 'dna',); > there's a typo "sekw2" instead of "seq2" but this is correct in my original > code). > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From asjo at koldfront.dk Mon Feb 4 15:00:32 2013 From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=) Date: Mon, 04 Feb 2013 21:00:32 +0100 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: (Slym's message of "Mon, 4 Feb 2013 10:02:29 -0800 (PST)") References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: <87txpr26jj.fsf@topper.koldfront.dk> On Mon, 4 Feb 2013 10:02:29 -0800 (PST), Slym wrote: > The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an > error (saying that it wants input provided by Bio::PrimarySeq). 

That sounds like you forgot to call ->next_seq() on the Bio::SeqIO object -
to get a sequence object - please see the complete, working example I sent
earlier.

Best regards,

Adam

-- 
"Denial springs eternal."                                    Adam Sjøgren
                                                        asjo at koldfront.dk

From scott at scottcain.net Tue Feb 5 09:45:14 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 5 Feb 2013 09:45:14 -0500
Subject: [Bioperl-l] Have your say in the 2013 GMOD Community Survey!
Message-ID: 

Give us your thoughts on the GMOD project and win a personal DNA test from
23andMe!

The GMOD project provides tools like GBrowse, Galaxy, MAKER, JBrowse,
Tripal, Apollo, Chado, and many more to a huge community of users and
developers around the world. To make sure that GMOD is giving you the
support you need, we want to know how you use GMOD, which components you
find valuable, your opinion on support, training, and GMOD's strengths and
weaknesses. Your feedback is vital in helping GMOD to serve its user
community more effectively and to suggest future directions for the
project.

Do the survey: http://gmod.org/survey.html

The survey should take between 10 and 15 minutes (including thinking
time), and participants can enter a draw to win "A Journey Through Your
DNA", the personal DNA test from 23andMe (the winner can pick a $50 Amazon
gift voucher if they prefer).

The survey will be open until March 1st. Results will be collated and
discussed at the April 2013 GMOD Meeting in Cambridge, UK, and posted on
the GMOD wiki at http://gmod.org.

Please spread the word to other friends and colleagues who use GMOD: the
more voices we hear, the better the picture we get of the needs of our
users, and the better we can help you!

Do the survey: http://gmod.org/survey.html

If you have any questions or problems with the survey, please email me --
I will be happy to help out!

Thanks,
Scott

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.
scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

From tiago.hori at gmail.com Tue Feb 5 10:21:55 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:21:55 -0800 (PST)
Subject: [Bioperl-l] Search I::O
Message-ID: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>

Hi All,

I am trying to find the best putative orthologs for 44K Atlantic Salmon
sequences, and so I need to parse 44K BLAST reports to find the best human
hit. I am trying to learn SearchIO, but when I try the first example from
the HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast' -file => 'C001R047.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }
  }
}

I get this error:

Odd number of elements in hash assignment at
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

I am using BioPerl version 1.6.901. Is there a format problem with the
blast reports?

Any help would be greatly appreciated!

T.

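A likely cause of the warning above, independent of the BLAST report format: the constructor call has no comma between 'blast' and -file. Perl then reads 'blast' -file as a subtraction, so three arguments reach the constructor instead of four, which would explain "Odd number of elements in hash assignment". The sketch below reproduces only the argument list and does not need BioPerl; the file name report.txt is a placeholder, not a file from this thread.

```perl
use strict;
use warnings;

# As typed in the post: no comma after 'blast'.
# 'blast' -file is parsed as the subtraction 'blast' - 'file', which
# evaluates to 0 (and triggers "isn't numeric" warnings here).
my @broken = (-format => 'blast' -file => 'report.txt');

# With the comma restored, the list holds two key/value pairs.
my @fixed = (-format => 'blast', -file => 'report.txt');

printf "broken: %d elements\n", scalar @broken;   # broken: 3 elements
printf "fixed:  %d elements\n", scalar @fixed;    # fixed:  4 elements
```

If that is indeed the problem, the call in the script would become Bio::SearchIO->new(-format => 'blast', -file => 'C001R047.txt'), with the rest of the loop unchanged.
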
From tiago.hori at gmail.com Tue Feb 5 10:33:32 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:33:32 -0800 (PST)
Subject: [Bioperl-l] Search::IO example from HOWTO
Message-ID: 

Hi All,

I am trying to run the example from the Search::IO HOWTO:

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast' -file => 'test.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=",   $result->query_name,
            " Hit=",        $hit->name,
            " Length=",     $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }
  }
}

And I get this error:

Odd number of elements in hash assignment at
/usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

Can anybody help?

Cheers,

T.

From carandraug+dev at gmail.com Tue Feb 5 13:56:21 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 18:56:21 +0000
Subject: [Bioperl-l] removing packages from bioperl-live
Message-ID: 

Hi

some of the bioperl-live packages have already been split into
separate repositories. However, they were never actually removed from
bioperl-live. This creates 2 entry points for bug fixes and
implementations. After a chat on #bioperl, I was told to ask here.

Should these be removed? For example, there's bioperl-FeatureIO but
that code also exists in bioperl-live. Can I remove it from
bioperl-live?

Carnë

From cjfields at illinois.edu Tue Feb 5 14:34:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 19:34:07 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO?
was Re: removing packages from bioperl-live In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Probably should retitle this to ask the question directly (make sure the right radars are pinged). My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). chris On Feb 5, 2013, at 12:56 PM, Carn? Draug wrote: > Hi > > some of the bioperl-live packages have already been split into > separate repositories. However, they were never actually removed from > bioperl-live. This creates 2 entry points for bug fixes and > implementations. After a chat on #bioperl, I was told to ask here. > > Should these be removed? For example, there's bioperl-FeatureIO but > that code alo exists in bioperl-live. Can I remove it from > bioperl-live? > > Carn? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Tue Feb 5 14:36:10 2013 From: scott at scottcain.net (Scott Cain) Date: Tue, 5 Feb 2013 14:36:10 -0500 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: I'm sure it will lead to lots of fun, but I suspect you are right and it should be removed. It's time you yank on that bandaid :-) Scott On Tue, Feb 5, 2013 at 2:34 PM, Fields, Christopher J wrote: > Probably should retitle this to ask the question directly (make sure the right radars are pinged). > > My vote is yes, it should be removed. 
There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). > > chris > > On Feb 5, 2013, at 12:56 PM, Carn? Draug wrote: > >> Hi >> >> some of the bioperl-live packages have already been split into >> separate repositories. However, they were never actually removed from >> bioperl-live. This creates 2 entry points for bug fixes and >> implementations. After a chat on #bioperl, I was told to ask here. >> >> Should these be removed? For example, there's bioperl-FeatureIO but >> that code alo exists in bioperl-live. Can I remove it from >> bioperl-live? >> >> Carn? >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From carandraug+dev at gmail.com Tue Feb 5 15:06:23 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 5 Feb 2013 20:06:23 +0000 Subject: [Bioperl-l] dependencies on perl version Message-ID: Hi how much perl backwards compatibility does bioperl needs to keep? If I have something I want to implement and use state (requires 5.010), is it acceptable? 5.010 is already a quite old perl version. Of course, there are other less elegant ways to implement those features. If I can't use modern perl stuff, what version number is the limit? Carn? 
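
For readers unfamiliar with the feature being asked about, here is a minimal sketch of `state` (enabled by `use 5.010`) next to the closure idiom that gives the same behaviour on perl 5.8. The sub names next_id and next_id_compat are hypothetical and exist only for this illustration; nothing here is taken from BioPerl code.

```perl
use 5.010;      # refuses to run on perl < 5.10; enables 'state' and 'say'
use strict;
use warnings;

# perl >= 5.10: the variable is initialised once and keeps its value
# between calls.
sub next_id {
    state $counter = 0;
    return ++$counter;
}

# perl 5.8 compatible: a lexical in an enclosing block, captured by
# the sub. More verbose, same effect once the block has run.
{
    my $counter = 0;
    sub next_id_compat { return ++$counter }
}

say next_id()        for 1 .. 3;   # prints 1, 2, 3 on separate lines
say next_id_compat() for 1 .. 3;   # prints 1, 2, 3 on separate lines
```

The trade-off in the question is therefore mostly one of readability: the second form runs on older perls at the cost of an extra block per persistent variable.
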
From carandraug+dev at gmail.com Tue Feb 5 15:10:01 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 5 Feb 2013 20:10:01 +0000 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: On 5 February 2013 19:34, Fields, Christopher J wrote: > Probably should retitle this to ask the question directly (make sure the right radars are pinged). > > My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). Mentioning Bio::FeatureIO was just an example. I meant to ask it as more general. If the code is already in a separate repository, should it be removed from bioperl-live? Carn? From cjfields at illinois.edu Tue Feb 5 15:56:48 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 20:56:48 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) chris On Feb 5, 2013, at 2:06 PM, Carn? Draug wrote: > Hi > > how much perl backwards compatibility does bioperl needs to keep? > > If I have something I want to implement and use state (requires > 5.010), is it acceptable? 5.010 is already a quite old perl version. 
> Of course, there are other less elegant ways to implement those > features. If I can't use modern perl stuff, what version number is the > limit? > > Carn? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Feb 5 15:59:38 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 20:59:38 +0000 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 2:10 PM, Carn? Draug wrote: > On 5 February 2013 19:34, Fields, Christopher J wrote: >> Probably should retitle this to ask the question directly (make sure the right radars are pinged). >> >> My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). > > Mentioning Bio::FeatureIO was just an example. I meant to ask it as > more general. If the code is already in a separate repository, should > it be removed from bioperl-live? > > Carn? Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better). Once we get a new release out we should remove the rest. 
chris From cjfields at illinois.edu Tue Feb 5 16:53:29 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 21:53:29 +0000 Subject: [Bioperl-l] Next BioPerl release Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> All, I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: https://github.com/bioperl/Bio-FeatureIO Feedback, suggestions, etc are greatly appreciated. chris From miker at htblis.com Tue Feb 5 19:54:17 2013 From: miker at htblis.com (Michael Rogoff) Date: Tue, 5 Feb 2013 16:54:17 -0800 Subject: [Bioperl-l] Bio::Graphics error when rendering features with Split locations Message-ID: When trying to render features from a genbank file that include a split location e.g.: promoter join(1000..1080,1..5) /label=PROM1 The following exception is raised: Can't locate object method "has_tag" via package "Bio::Location::Simple" at lib/perl5/site_perl/5.10.1/Bio/Graphics/Glyph.pm line 704, line 36. This can be reproduced with the code in the example "Rendering Features from a GenBank or EMBL File" from the Graphics HOW-TO: http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File Is there a way to change the script so that split locations would, at the very least, not cause a fatal error? Is there a different glyph type that needs to be used? Thanks in advance for any help. 
I've attached a simple genbank input that will reproduce the error:

LOCUS       sample2                 1080 bp    DNA     circular
DEFINITION  Cloning vector sample2
ACCESSION   sample2
VERSION     sample2.1  GI:4352432
COMMENT     Component Fragments
FEATURES             Location/Qualifiers
     terminator      39..328
                     /label=TERM1
                     /note="terminator 1"
     misc_feature    393..488
                     /label=MF1
     CDS             complement(800..900)
                     /label=CDS1
                     /note="resistence gene"
     promoter        join(1000..1080,1..5)
                     /label=PROM1
ORIGIN
        1 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
       61 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      121 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      181 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      241 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      301 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      361 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      421 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      481 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      541 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      601 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      661 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      721 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      781 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      841 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      901 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      961 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
     1021 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
//

P.S. I think I have traced the source of the problem to Glyph's _subfeat
method, which in the case of a feature with split locations is returning
location objects instead of feature objects. Is this a bug?

sub _subfeat {
  my $class   = shift;
  my $feature = shift;

  return $feature->segments if $feature->can('segments');

  my @split = eval {
    my $id   = $feature->location->seq_id;
    my @subs = $feature->location->sub_Location;
    grep {$id eq $_->seq_id} @subs;
  };
  return @split if @split;

  # Either the APIs have changed, or I got confused at some point...
  return $feature->get_SeqFeatures if $feature->can('get_SeqFeatures');
  return $feature->sub_SeqFeature  if $feature->can('sub_SeqFeature');
  return;
}

From l.m.timmermans at students.uu.nl Tue Feb 5 21:40:27 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 03:40:27 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID: 

On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J wrote:
> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top.
>
> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)

I *really* hate saying it, but I fear a lot of places are still stuck
on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
department still is and doesn't seem to be in a hurry to upgrade, and
I'm pretty sure it won't be the only one (though personally I use a
self-compiled 5.16).

Leon

From florent.angly at gmail.com Tue Feb 5 21:51:27 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:51:27 +1000
Subject: [Bioperl-l] Removing Bio::FeatureIO?
was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu> Message-ID: <5111C52F.50101@gmail.com> On 06/02/13 06:59, Fields, Christopher J wrote: > On Feb 5, 2013, at 2:10 PM, Carn? Draug wrote: > >> On 5 February 2013 19:34, Fields, Christopher J wrote: >>> Probably should retitle this to ask the question directly (make sure the right radars are pinged). >>> >>> My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). >> Mentioning Bio::FeatureIO was just an example. I meant to ask it as >> more general. If the code is already in a separate repository, should >> it be removed from bioperl-live? >> >> Carn? > Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better). Once we get a new release out we should remove the rest. > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Sounds good to me (I've been burnt once by the fact that Bio::FeatureIO is in two places). 
Florent From florent.angly at gmail.com Tue Feb 5 21:56:19 2013 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 06 Feb 2013 12:56:19 +1000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: <5111C653.2010703@gmail.com> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). Florent On 06/02/13 12:40, Leon Timmermans wrote: > On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J > wrote: >> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >> >> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) > I *really* hate saying it, but I fear a lot of places are still stuck > on 5.8, in particular on 5.8.8 because of CentOS 5. I know my > department still is and doesn't seem to be in a hurry to upgrade, and > I'm pretty sure it won't be the only one (though personally I use a > self-compiled 5.16). > > Leon > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at drycafe.net Tue Feb 5 22:27:35 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 5 Feb 2013 22:27:35 -0500 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <09524241-59F8-4BFF-8054-53CD0A649C11@drycafe.net> On Feb 5, 2013, at 4:53 PM, Fields, Christopher J wrote: > I am scheduling the next BioPerl CPAN release tentatively for March 1. Yay!! Thanks for your leadership again, Chris, and for volunteering your time for the project. 
If nothing else, and I know this is no compensation really worth speaking of, we owe you beer, and I'll certainly pay my debt to you in Berlin if you come there. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From hlapp at drycafe.net Tue Feb 5 22:32:40 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 5 Feb 2013 22:32:40 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <5111C653.2010703@gmail.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS. 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. -hilmar On Feb 5, 2013, at 9:56 PM, Florent Angly wrote: > For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). > Florent > > On 06/02/13 12:40, Leon Timmermans wrote: >> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J >> wrote: >>> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >>> >>> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) >> I *really* hate saying it, but I fear a lot of places are still stuck >> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my >> department still is and doesn't seem to be in a hurry to upgrade, and >> I'm pretty sure it won't be the only one (though personally I use a >> self-compiled 5.16). 
>> >> Leon >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Tue Feb 5 22:58:08 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 03:58:08 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18CBE@CHIMBX5.ad.uillinois.edu> Re: being held back, I agree. I don't necessarily want to intentionally break current modules by adding modern code unless it can be demonstrated to be a decent benefit performance-wise, but I don't want to impede new additions by requiring compat with perl 5.8 (hence my suggestion of a 'use 5.01x' pragma when appropriate). Ubuntu 12.04 LTS is on perl 5.14.2: http://askubuntu.com/questions/80672/what-perl-version-will-be-in-12-04-lts BTW, I was wrong about perl 5.8 being 8 yrs old; it's almost 11 yrs old (perl 5.8.0 was released on 7/18/2002). perl 5.8 reached end-of-life in 2008, fixes being only for security reasons. So, I support dropping perl 5.8 support, but we should have a decent route of use for the folks stuck on old clusters. chris On Feb 5, 2013, at 9:32 PM, Hilmar Lapp wrote: > Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS. > > 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. 
> > -hilmar > > On Feb 5, 2013, at 9:56 PM, Florent Angly wrote: > >> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). >> Florent >> >> On 06/02/13 12:40, Leon Timmermans wrote: >>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J >>> wrote: >>>> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >>>> >>>> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) >>> I *really* hate saying it, but I fear a lot of places are still stuck >>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my >>> department still is and doesn't seem to be in a hurry to upgrade, and >>> I'm pretty sure it won't be the only one (though personally I use a >>> self-compiled 5.16). >>> >>> Leon >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From l.m.timmermans at students.uu.nl Tue Feb 5 23:11:52 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 05:11:52 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: On Wed, Feb 6, 2013 at 4:32 
AM, Hilmar Lapp wrote: > Does anyone know what Ubuntu uses? 5.14.2, distrowatch is your friend ;-) > I've heard lots of other old version problems with CentOS. I know people who still use CentOS 4 in production :-| > 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. CentOS 5 is 6 years old (and will be supported another 4), but CentOS 6 is 'only' 19 months. perl missing a release in the 5.8-5.10 timeframe combined with an unfortunate alignment of its release schedule with Red Hat's don't do us any favors here. Leon From cjfields at illinois.edu Tue Feb 5 23:14:24 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 04:14:24 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18E52@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 8:40 PM, Leon Timmermans wrote: > On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J > wrote: >> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >> >> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) > > I *really* hate saying it, but I fear a lot of places are still stuck > on 5.8, in particular on 5.8.8 because of CentOS 5. I know my > department still is and doesn't seem to be in a hurry to upgrade, and > I'm pretty sure it won't be the only one (though personally I use a > self-compiled 5.16). > > Leon We had the same problem for a while, but our sysadmins were willing to set up perl 5.12 (at that time) loadable as a module (we can of course set up a local perl as well). We're now using a sysadmin-installed perl 5.16 with our current cluster. 
chris From cjfields at illinois.edu Tue Feb 5 23:24:31 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 04:24:31 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 10:11 PM, Leon Timmermans wrote: > On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp wrote: >> Does anyone know what Ubuntu uses? > > 5.14.2, distrowatch is your friend ;-) > >> I've heard lots of other old version problems with CentOS. > > I know people who still use CentOS 4 in production :-| > >> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. > > CentOS 5 is 6 years old (and will be supported another 4), but CentOS > 6 is 'only' 19 months. perl missing a release in the 5.8-5.10 > timeframe combined with an unfortunate alignment of its release > schedule with Red Hat's don't do us any favors here. > > Leon Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7). We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases. 
chris From l.m.timmermans at students.uu.nl Tue Feb 5 23:33:57 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 05:33:57 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 6, 2013 at 5:24 AM, Fields, Christopher J wrote: > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7). > > We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases. Sounds reasonable. These things shouldn't come as a surprise. I suspect that the thing that will save us is that most of these people install it once and then never upgrade. Leon From hartzell at alerce.com Wed Feb 6 12:58:07 2013 From: hartzell at alerce.com (George Hartzell) Date: Wed, 6 Feb 2013 09:58:07 -0800 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> Message-ID: <20754.39343.128576.743448@gargle.gargle.HOWL> Fields, Christopher J writes: > [...] > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point > out that Python users are in the same boat: the Python version for > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > (and recommends python 2.7). 
> > We can always state that perl 5.8 is supported for the upcoming > Bioperl release, but we're dropping v5.8 support for any future > releases. Do more than drop support for 5.8. The Perl community has put a transparent and predictable process in place for releasing [generally] better versions of the language. It means that Perl has a chance of continuing to be relevant, attracting new talent and actually *fixing* some of the s&%t that gives Perl a bad rap. It gives people something to plan around; no one should be surprised that v 5.X.Y is coming out in mid 20ZZ. BioPerl should do the same thing: declare a release policy that trails along with the Perl release schedule. Keep it simple and no one can argue with it. Support Perl releases as long as the releases themselves are supported. Rather than expending energy supporting out-of-date platforms, put the energy into being modern (or Modern...), better distro building and packaging, testing, documentation and releasing so that the process of staying current is painless. Look forward. Keep it interesting and fun. Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone make their living running sequencing gels in Plexiglas doohickeys on their lab bench? I'm not suggesting that the BioPerl community is free to make arbitrary and capricious changes that make it difficult for *anyone* to get anything done. Churn is a waste of time. But why should the all-volunteer BioPerl community be stuck supporting code from 12 years ago because it's cost-effective for someone else to avoid spending *their* $/time/people to stay up to date? Those sites that value stability/maturity/stagnation so highly have already accepted the cost/difficulty of nailing one of their feet to the floor as they try to run forward. They recognize and depend on the benefits of having that stable base but generally they've also accepted the costs associated with their restrictive choices.
They know how to pull in separate kernel/driver updates so that they can actually run on nearly modern hardware. They know, and live with, the fact that they're not going to have access to the shiny new stuff. And they know how to stay up to date, when they need to, with the software that their users need to be competitive (e.g. BioConductor and R). As long as (if/when...) updating a BioPerl release is something that can reliably happen with a few cpanm invocations then the sites that otherwise favor punctuated equilibrium will learn to handle gradual change. Those folks that are "stuck" on older releases always have the option of supporting professional Perl programmers to keep older releases going, backport changes, etc.... They're already buying support for their platforms (or freeloading and coping), let them put bread on the table at one of the bioinformatics consultancies or labs if they have something special they need. Have fun. Use sharp tools. Do cool science. Build cool things. No one is paying you to be backwards compatible with the previous millennium. g. From amackey at virginia.edu Wed Feb 6 13:47:46 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Wed, 6 Feb 2013 13:47:46 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: Huzzah! -- Aaron J. Mackey, PhD Assistant Professor Center for Public Health Genomics University of Virginia amackey at virginia.edu http://www.cphg.virginia.edu/mackey On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: > > [...] > > Right, it took ~8 yrs to go from 5.8 to 5.10. 
I'd like to point > > out that Python users are in the same boat: the Python version for > > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > > (and recommends python 2.7). > > > > We can always state that perl 5.8 is supported for the upcoming > > Bioperl release, but we're dropping v5.8 support for any future > > releases. > > Do more than drop support for 5.8. > > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. > > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. 
> > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From tiago.hori at gmail.com Wed Feb 6 08:25:41 2013 From: tiago.hori at gmail.com (Tiago Hori) Date: Wed, 6 Feb 2013 05:25:41 -0800 (PST) Subject: [Bioperl-l] Problems installing Bio::Tools::Run:StandAloneBlastPlus Message-ID: <9b488c6e-34b3-4269-a7ac-e2206720939a@googlegroups.com> Hi Guys, I am trying to install the module Bio::Tools::Run::StandAloneBlastPlus, but it has been hard so far. I managed to compile and install samtools after finding all of its dependencies, but I am still missing something. I have posted the complete report below. Any help would be great! Cheers, T. cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus Reading '/home/tiagohori/.cpan/Metadata' Database was generated on Tue, 05 Feb 2013 18:41:03 GMT Running install for module 'Bio::Tools::Run::StandAloneBlastPlus' Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Checksum for /home/tiagohori/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz ok Scanning cache /home/tiagohori/.cpan/build for sizes ..................................------------------------------------------DONE DEL(1/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz DEL(2/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz.yml DEL(3/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO DEL(4/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO.yml DEL(5/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC DEL(6/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC.yml DEL(7/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt DEL(8/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt.yml DEL(9/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4 DEL(10/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4.yml DEL(11/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5
DEL(12/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5.yml DEL(13/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn DEL(14/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn.yml DEL(15/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o DEL(16/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o.yml DEL(17/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U DEL(18/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U.yml DEL(19/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v DEL(20/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v.yml CPAN.pm: Building C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Install scripts? y/n [n ] n Do you want to run tests that require connection to servers across the internet (likely to cause some failures)? y/n [n ] n - will not run internet-requiring tests Created MYMETA.yml and MYMETA.json Creating new 'Build' script for 'BioPerl-Run' version '1.006900' Building BioPerl-Run CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build -- OK Running Build test t/Amap.t ...................... 1/18 # Required executable for Bio::Tools::Run::Alignment::Amap is not present t/Amap.t ...................... ok t/AnalysisFactory_soap.t ...... skipped: Network tests have not been requested t/Analysis_soap.t ............. skipped: Network tests have not been requested t/BEDTools.t .................. 3/423 # Required executable for Bio::Tools::Run::BEDTools is not present t/BEDTools.t .................. ok t/BWA.t ....................... 1/36 # Required executable for Bio::Tools::Run::BWA is not present t/BWA.t ....................... ok t/Blat.t ...................... 1/33 # Required executable for Bio::Tools::Run::Alignment::Blat is not present # Looks like you planned 33 tests but ran 20. t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 13/33 subtests (less 15 skipped subtests: 5 okay) t/Bowtie.t .................... 
1/73 # Required executable for Bio::Tools::Run::Bowtie is not present t/Bowtie.t .................... ok t/Cap3.t ...................... 1/91 # Required executable for Bio::Tools::Run::Cap3 is not present t/Cap3.t ...................... ok t/Clustalw.t .................. 1/45 # Required executable for Bio::Tools::Run::Alignment::Clustalw is not present t/Clustalw.t .................. ok t/Coil.t ...................... 2/6 # Required executable for Bio::Tools::Run::Coil is not present t/Coil.t ...................... ok t/Consense.t .................. 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::Consense is not present t/Consense.t .................. ok t/DBA.t ....................... 1/18 # Required executable for Bio::Tools::Run::Alignment::DBA is not present t/DBA.t ....................... ok t/DrawGram.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawGram is not present t/DrawGram.t .................. ok t/DrawTree.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawTree is not present t/DrawTree.t .................. ok t/EMBOSS.t .................... ok t/Ensembl.t ................... skipped: Network tests have not been requested t/Eponine.t ................... 1/7 # Looks like you planned 7 tests but ran 2. t/Eponine.t ................... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 5/7 subtests t/Exonerate.t ................. 1/89 # Required executable for Bio::Tools::Run::Alignment::Exonerate is not present t/Exonerate.t ................. ok t/FootPrinter.t ............... 1/24 # Required executable for Bio::Tools::Run::FootPrinter is not present t/FootPrinter.t ............... ok t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable $GENEMARK_MODELS is not set t/Genemark.hmm.prokaryotic.t .. ok t/Genewise.t .................. 1/20 # Required executable for Bio::Tools::Run::Genewise is not present t/Genewise.t .................. 
ok t/Genscan.t ................... 1/6 # Required environment variable $GENSCANDIR is not set t/Genscan.t ................... ok t/Gerp.t ...................... 1/33 # Required executable for Bio::Tools::Run::Phylo::Gerp is not present t/Gerp.t ...................... ok t/Glimmer2.t .................. 1/217 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer2.t .................. ok t/Glimmer3.t .................. 1/111 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer3.t .................. ok t/Gumby.t ..................... 1/124 # Required executable for Bio::Tools::Run::Phylo::Gumby is not present t/Gumby.t ..................... ok t/Hmmer.t ..................... 1/27 # Required executable for Bio::Tools::Run::Hmmer is not present t/Hmmer.t ..................... ok t/Hyphy.t ..................... 2/15 # Required executable for Bio::Tools::Run::Phylo::Hyphy::SLAC is not present t/Hyphy.t ..................... ok t/Infernal.t .................. 1/43 # Required executable for Bio::Tools::Run::Infernal is not present t/Infernal.t .................. ok t/Kalign.t .................... 1/8 # Required executable for Bio::Tools::Run::Alignment::Kalign is not present t/Kalign.t .................... ok t/LVB.t ....................... 1/19 # Required executable for Bio::Tools::Run::Phylo::LVB is not present t/LVB.t ....................... ok t/Lagan.t ..................... 1/12 # Required executable for Bio::Tools::Run::Alignment::Lagan is not present t/Lagan.t ..................... ok t/MAFFT.t ..................... 1/17 # Required executable for Bio::Tools::Run::Alignment::MAFFT is not present t/MAFFT.t ..................... ok t/MCS.t ....................... 1/24 # Required executable for Bio::Tools::Run::MCS is not present t/MCS.t ....................... ok t/Maq.t ....................... 1/51 # Required executable for Bio::Tools::Run::Maq is not present t/Maq.t ....................... ok t/Match.t ..................... 
1/7 # Required executable for Bio::Tools::Run::Match is not present t/Match.t ..................... ok t/Mdust.t ..................... 1/5 # Required executable for Bio::Tools::Run::Mdust is not present t/Mdust.t ..................... ok t/Meme.t ...................... 1/25 # Required executable for Bio::Tools::Run::Meme is not present t/Meme.t ...................... ok t/Minimo.t .................... 1/72 # Required executable for Bio::Tools::Run::Minimo is not present t/Minimo.t .................... ok t/Molphy.t .................... 1/10 # Required executable for Bio::Tools::Run::Phylo::Molphy::ProtML is not present t/Molphy.t .................... ok t/Muscle.t .................... 1/16 # Required executable for Bio::Tools::Run::Alignment::Muscle is not present t/Muscle.t .................... ok t/Neighbor.t .................. 1/17 # Required executable for Bio::Tools::Run::Phylo::Phylip::Neighbor is not present t/Neighbor.t .................. ok t/Newbler.t ................... 1/98 # Required executable for Bio::Tools::Run::Newbler is not present t/Newbler.t ................... ok t/Njtree.t .................... 1/6 # Required executable for Bio::Tools::Run::Phylo::Njtree::Best is not present t/Njtree.t .................... ok t/PAML.t ...................... 1/28 # Required executable for Bio::Tools::Run::Phylo::PAML::Codeml is not present t/PAML.t ...................... ok t/Pal2Nal.t ................... 1/9 # Required executable for Bio::Tools::Run::Alignment::Pal2Nal is not present t/Pal2Nal.t ................... ok t/PhastCons.t ................. 1/181 # Required executable for Bio::Tools::Run::Phylo::Phast::PhastCons is not present t/PhastCons.t ................. ok t/Phrap.t ..................... 1/127 # Required executable for Bio::Tools::Run::Phrap is not present t/Phrap.t ..................... ok t/Phyml.t ..................... 1/47 # Required executable for Bio::Tools::Run::Phylo::Phyml is not present t/Phyml.t ..................... 
ok t/Primate.t ................... 1/8 # Required executable for Bio::Tools::Run::Primate is not present t/Primate.t ................... ok t/Primer3.t ................... 1/9 # Required executable for Bio::Tools::Run::Primer3 is not present t/Primer3.t ................... ok t/Prints.t .................... 1/7 # Required executable for Bio::Tools::Run::Prints is not present t/Prints.t .................... ok t/Probalign.t ................. 1/13 # Required executable for Bio::Tools::Run::Alignment::Probalign is not present t/Probalign.t ................. ok t/Probcons.t .................. 1/11 # Required executable for Bio::Tools::Run::Alignment::Probcons is not present t/Probcons.t .................. ok t/Profile.t ................... 1/7 # Required executable for Bio::Tools::Run::Profile is not present t/Profile.t ................... ok t/Promoterwise.t .............. 1/9 # Required executable for Bio::Tools::Run::Promoterwise is not present t/Promoterwise.t .............. ok t/ProtDist.t .................. 1/14 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtDist is not present t/ProtDist.t .................. ok t/ProtPars.t .................. 1/11 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtPars is not present t/ProtPars.t .................. ok t/Pseudowise.t ................ 1/18 # Required executable for Bio::Tools::Run::Pseudowise is not present t/Pseudowise.t ................ ok t/QuickTree.t ................. 1/13 # Required executable for Bio::Tools::Run::Phylo::QuickTree is not present t/QuickTree.t ................. ok t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as or not executable. # Required executable for Bio::Tools::Run::RepeatMasker is not present t/RepeatMasker.t .............. ok t/SABlastPlus.t ............... 1/65 # Required executable for Bio::Tools::Run::BlastPlus is not present # Looks like you planned 65 tests but ran 63. t/SABlastPlus.t ............... 
Dubious, test returned 255 (wstat 65280, 0xff00) Failed 2/65 subtests (less 59 skipped subtests: 4 okay) t/SLR.t ....................... 1/7 # Required executable for Bio::Tools::Run::Phylo::SLR is not present t/SLR.t ....................... ok t/Samtools.t .................. ok t/Seg.t ....................... 1/8 # Required executable for Bio::Tools::Run::Seg is not present t/Seg.t ....................... ok t/Semphy.t .................... 1/19 # Required executable for Bio::Tools::Run::Phylo::Semphy is not present t/Semphy.t .................... ok t/SeqBoot.t ................... 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present t/SeqBoot.t ................... ok t/Signalp.t ................... 1/7 # Required executable for Bio::Tools::Run::Signalp is not present t/Signalp.t ................... ok t/Sim4.t ...................... 1/23 # Required executable for Bio::Tools::Run::Alignment::Sim4 is not present t/Sim4.t ...................... ok t/Simprot.t ................... 1/6 # Required executable for Bio::Tools::Run::Simprot is not present t/Simprot.t ................... ok t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/SoapEU-unit.t ............... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/StandAloneFasta.t ........... 1/15 # Required executable for Bio::Tools::Run::Alignment::StandAloneFasta is not present t/StandAloneFasta.t ........... ok t/TCoffee.t ................... 1/27 # Required executable for Bio::Tools::Run::Alignment::TCoffee is not present t/TCoffee.t ................... ok t/TigrAssembler.t ............. 1/88 # Required executable for Bio::Tools::Run::TigrAssembler is not present # Required executable for Bio::Tools::Run::TigrAssembler is not present t/TigrAssembler.t ............. ok t/Tmhmm.t ..................... 
1/9 # Required executable for Bio::Tools::Run::Tmhmm is not present t/Tmhmm.t ..................... ok t/TribeMCL.t .................. ok t/Vista.t ..................... ok t/gmap-run.t .................. 1/8 # Required executable for Bio::Tools::Run::Alignment::Gmap is not present t/gmap-run.t .................. ok t/tRNAscanSE.t ................ 1/12 # Required executable for Bio::Tools::Run::tRNAscanSE is not present t/tRNAscanSE.t ................ ok Test Summary Report ------------------- t/Blat.t (Wstat: 65280 Tests: 20 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 33 tests but ran 20. t/Eponine.t (Wstat: 65280 Tests: 2 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 7 tests but ran 2. t/SABlastPlus.t (Wstat: 65280 Tests: 63 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 65 tests but ran 63. Files=80, Tests=2876, 39 wallclock secs ( 0.54 usr 0.23 sys + 32.54 cusr 4.94 csys = 38.25 CPU) Result: FAIL Failed 3/80 test programs. 0/2876 subtests failed. CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build test -- NOT OK //hint// to see the cpan-testers results for installing this module, try: reports CJFIELDS/BioPerl-Run-1.006900.tar.gz Running Build install make test had returned bad status, won't install without force From guy.leonard at gmail.com Wed Feb 6 13:35:38 2013 From: guy.leonard at gmail.com (guy.leonard at gmail.com) Date: Wed, 6 Feb 2013 10:35:38 -0800 (PST) Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> Nice, super work. Will there be a rough list of feature changes/addition/deprecation, or shall I consult git logs? 
On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote: > > All, > > I am scheduling the next BioPerl CPAN release tentatively for March 1. > Any help in triaging bug reports would be greatly appreciated! > > Amongst all other changes, as mentioned in a separate thread we will > remove Bio::FeatureIO, now developed in a separate repository: > > https://github.com/bioperl/Bio-FeatureIO > > Feedback, suggestions, etc are greatly appreciated. > > chris > _______________________________________________ > Bioperl-l mailing list > Biop...
at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From sidd.basu at gmail.com Wed Feb 6 14:36:17 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Wed, 6 Feb 2013 13:36:17 -0600 Subject: [Bioperl-l] Re: Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> Hi, On Tue, 05 Feb 2013, Fields, Christopher J wrote: > All, > > I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! > > Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: > > https://github.com/bioperl/Bio-FeatureIO > > Feedback, suggestions, etc are greatly appreciated. Here are the CI build reports for Perl 5.12, 5.14 and 5.16 using Travis. https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true I could not get 5.10 to work on Travis. Although I activated the (--network) option, it still didn't run one of the tests that needs the network. Also, I was initially confused by the fact that, although the repository has a dist.ini, the tests still have to be run through Build.PL. Running **dzil test** does not work. Hope this helps.
thanks, -siddhartha > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 6 14:46:49 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 19:46:49 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A109@CHIMBX5.ad.uillinois.edu> We've been a little better at keeping track of significant changes this time 'round. There aren't a lot of major updates, but it's important to make sure we get a release out to ensure everyone (not just those familiar with git) can access them. chris On Feb 6, 2013, at 12:35 PM, wrote: > Nice, super work. > > Will there be a rough list of feature changes/addition/deprecation, or > shall I consult git logs? > > On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote: >> >> All, >> >> I am scheduling the next BioPerl CPAN release tentatively for March 1. >> Any help in triaging bug reports would be greatly appreciated! >> >> Amongst all other changes, as mentioned in a separate thread we will >> remove Bio::FeatureIO, now developed in a separate repository: >> >> https://github.com/bioperl/Bio-FeatureIO >> >> Feedback, suggestions, etc are greatly appreciated. >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Biop... 
at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 6 14:54:58 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 19:54:58 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> On Feb 6, 2013, at 1:36 PM, Siddhartha Basu wrote: > Hi, > > On Tue, 05 Feb 2013, Fields, Christopher J wrote: > >> All, >> >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! >> >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: >> >> https://github.com/bioperl/Bio-FeatureIO >> >> Feedback, suggestions, etc are greatly appreciated. > > Here are CI build report on 5.12, 5.14 and 5.16 using travis. > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true > > Could not get 5.10 to work on travis. Though i activated the (--network) > option, it still didn't run one of the test that needs network. Also, initially got > confused by the fact that though it has dist.ini, the tests still has > to run through Build.PL. Running **dzil test** do not work. > > Hope this helps. > > thanks, > -siddhartha Just to point out, that was for Bio-FeatureIO. Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release). 
Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken). I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed. Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation. chris From sidd.basu at gmail.com Wed Feb 6 15:26:06 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Wed, 6 Feb 2013 14:26:06 -0600 Subject: [Bioperl-l] Re: Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> Message-ID: <5112bc60.c69e320a.1e98.2028@mx.google.com> On Wed, 06 Feb 2013, Fields, Christopher J wrote: > On Feb 6, 2013, at 1:36 PM, Siddhartha Basu > wrote: > > > Hi, > > > > On Tue, 05 Feb 2013, Fields, Christopher J wrote: > > > >> All, > >> > >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! > >> > >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: > >> > >> https://github.com/bioperl/Bio-FeatureIO > >> > >> Feedback, suggestions, etc are greatly appreciated. > > > > Here are CI build report on 5.12, 5.14 and 5.16 using travis. > > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true > > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true > > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true > > > > Could not get 5.10 to work on travis. Though i activated the (--network) > > option, it still didn't run one of the test that needs network. 
Also, initially got > > confused by the fact that though it has dist.ini, the tests still has > > to run through Build.PL. Running **dzil test** do not work. > > > > Hope this helps. > > > > thanks, > > -siddhartha > > Just to point out, that was for Bio-FeatureIO. Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release). So, what steps are left for getting the release out to CPAN? For example, are there a lot of feature branches still left to be merged, or a lot of unit tests still not passing? I am just trying to figure out whether I could be of any help in expediting the release process. If these are already taken care of, please ignore this. > > Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken). I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed. As for the error I encountered, the presence of Build.PL was blocking the dzil build/release process. By default, dzil expects to generate Build.PL during its build/release process. However, I am not sure which mode is most suitable for BioPerl devs. > Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation. thanks, -siddhartha > > chris From hlapp at drycafe.net Wed Feb 6 16:30:33 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 6 Feb 2013 16:30:33 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: Great points, George, and you're making a very compelling argument. I'm in total agreement.
It's almost becoming a reason to having to be embarrassed to still be programming in Perl these days, so one might as well have fun while it lasts. -hilmar On Feb 6, 2013, at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: >> [...] >> Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point >> out that Python users are in the same boat: the Python version for >> CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 >> (and recommends python 2.7). >> >> We can always state that perl 5.8 is supported for the upcoming >> Bioperl release, but we're dropping v5.8 support for any future >> releases. > > Do more than drop support for 5.8. > > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. 
> > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. > > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Wed Feb 6 17:11:06 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:11:06 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> George, Should put your post on a pedestal :) tl;dr version: I completely agree, but we need help in order to do this. Long(-winded) version: I agree completely, backwards compatibility is killing us. But, we do need current and new people to get involved and help drive this forward. We need people on all fronts, from coding and bug fixes to documentation and web site maintenance. I've been driving this bus for a number of years now. Not getting tired yet, but I am getting substantially busier with my current endeavors, so my time spent working on BioPerl has dwindled considerably. Any additional support or sharing of responsibilities will help tremendously in keeping up momentum (if someone else wants to take the wheel for a bit, please let me know :). If we follow the perl release route, we should streamline the release process (think Dist::Zilla), end support of older versions of Perl, and work on a sustainable release schedule. The fact that we have so many of us so-called 'old folks' speaking up in favor of this is a very good sign. We do need a bit more than that; we need help. 
BioPerl is a very large project. A key point we need to address, which is very important for the future of BioPerl. I use Perl quite a bit in my current work (dabble with Ruby and Python as well when I have to). BioPerl? A little, but not as much as I could. Shocked? The main three reason I don't use it 'in anger': performance, performance, and performance. It is very important that we make a concerted effort to address this at all levels. It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them). A specific example: Heng Li once tested the performance of FASTQ parsing (perl, python, bioperl, biopython, his C code, etc). BioPerl's FASTQ couldn't even be measured; IIRC it went on for many hours until he killed it. This was with the older version of the parser, but I'm willing to bet the newer one I wrote isn't any better. This. needs. to. change. I see no problem in stating any generic parsing and low-level interfaces are just as much a part of what BioPerl encompasses as the higher-level Bio::* classes themselves. Steve and Jason were on to something with SearchIO; it's maybe not as performant as we would like, but it certainly is more flexible in terms of what can be done, b/c it separates out low-level parsing from object creation. That's the general model we should look at. There is a good reason Biopython is following this model with their SearchIO implementation (Peter C, are you reading this?) We have a lot of very talented people involved with this project, both on the purely computational and purely biological end as well as the folks like me who straddle the two domains. A lot of good code out there that can be used, wrapped, taken advantage of, including everything we currently have in BioPerl. Let's come up with something that both works and works well, that people can use on a regular basis, even at a low level if they choose. 
That alone would dissuade new users from writing up (yet another) custom FASTA/FASTQ/BLAST/GenBank/etc parser b/c the BioPerl one takes millennia to finish. A few examples on this front: Rob Buels created a generic parser for GFF3 (Bio::GFF3::LowLevel) with very few dependencies, we wrap this with the newer Bio::FeatureIO code. Leon has Bio::SFF. Lincoln of course wrote Bio::DB::Sam and Bio::DB::BigFile. I have started a wrapper around Heng's FASTQ/FASTA parsing code (kseq), it seems to work quite well (~20M FASTQ in 30 sec last I recall?). So: If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that. If it means creating a new Bio-NGS repo to focus some of these efforts, so be it. If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it. If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes). If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later. But I can't do it alone. We (not just me, but we) need to drive the direction we take. First one who codes gets the gold ring. chris On Feb 6, 2013, at 12:47 PM, Aaron Mackey wrote: > Huzzah! > > -- > Aaron J. Mackey, PhD > Assistant Professor > Center for Public Health Genomics > University of Virginia > amackey at virginia.edu > http://www.cphg.virginia.edu/mackey > > > On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: > > [...] > > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point > > out that Python users are in the same boat: the Python version for > > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > > (and recommends python 2.7). > > > > We can always state that perl 5.8 is supported for the upcoming > > Bioperl release, but we're dropping v5.8 support for any future > > releases. > > Do more than drop support for 5.8. 
> > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. > > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. > > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. 
They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Wed Feb 6 17:34:42 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:34:42 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1AF0C@CHIMBX5.ad.uillinois.edu> I want to clarify, parser optimization isn't the only point we need to focus on by any means (and may not be the main one). 
There is a lot of room for improvement top to bottom; that was just one specific example I have long held to be an issue. -c On Feb 6, 2013, at 4:11 PM, "Fields, Christopher J" wrote: > Shocked? The main three reason I don't use it 'in anger': performance, performance, and performance. It is very important that we make a concerted effort to address this at all levels. It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them). ... From p.j.a.cock at googlemail.com Wed Feb 6 17:43:13 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 6 Feb 2013 22:43:13 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J wrote: > > I see no problem in stating any generic parsing and low-level interfaces > are just as much a part of what BioPerl encompasses as the higher-level > Bio::* classes themselves. Steve and Jason were on to something with > SearchIO; it's maybe not as performant as we would like, but it certainly > is more flexible in terms of what can be done, b/c it separates out > low-level parsing from object creation. That's the general model we > should look at. There is a good reason Biopython is following this > model with their SearchIO implementation (Peter C, are you reading this?) Actually I don't think we did end up with that kind of separation in the Biopython SearchIO - which is not to say it isn't an excellent model to follow.
Rather, the Biopython SearchIO (like the BioPerl one) had as its first goal a consistent object model across assorted file formats. The idea of low-level, minimal-overhead parsers (which are very format-specific), on which a heavier but consistent object model can be built, might be a good balance - the high-level API has the convenience, but if you give that up you can have more speed. That's what I recommend with FASTQ and Biopython, e.g. http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > > I have started a wrapper around Heng's FASTQ/FASTA parsing > code (kseq), it seems to work quite well (~20M FASTQ in 30 sec > last I recall?). > I'd have to dig through my emails, but I think the BioRuby guys looked at that too - as I recall, while it was fast, the error handling left something to be desired. Email me directly or on the BioRuby list if you want to follow up on that. Regards, Peter From cjfields at illinois.edu Wed Feb 6 17:53:21 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:53:21 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > wrote: >> >> I see no problem in stating any generic parsing and low-level interfaces >> are just as much a part of what BioPerl encompasses as the higher-level >> Bio::* classes themselves.
Steve and Jason were on to something with >> SearchIO; it's maybe not as performant as we would like, but it certainly >> is more flexible in terms of what can be done, b/c it separates out >> low-level parsing from object creation. That's the general model we >> should look at. There is a good reason Biopython is following this >> model with their SearchIO implementation (Peter C, are you reading this?) > > Actually I don't think we did end up with that kind of separation in the > Biopython SearchIO - which is not so say it isn't an excellent model > to follow. Rather the Biopython SearchIO (like the BioPerl one) had > as the first goal a consistent object model across assorted file > formats. > > The idea of a low level minimal overhead parsers (which are very > format specific), on which a heavier but consistent object model > can be built might be a good balance - the high level API has the > connivence, but if you give that up you can have more speed. > That's what I recommend with FASTQ and Biopython, e.g. > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > >> >> I have started a wrapper around Heng's FASTQ/FASTA parsing >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec >> last I recall?). >> > > I'd have to dig through my emails, but I think the BioRuby guys > looked at that too - as I recall while it was fast, the error handling > left something to be desired. Email me directly or on the BioRuby > list if you want to follow up on that. > > Regards, > > Peter I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out. IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into. Maybe worth moving to open-bio-l for broader discussion. 
chris From whereverroadgoes at gmail.com Wed Feb 6 16:59:04 2013 From: whereverroadgoes at gmail.com (Slym) Date: Wed, 6 Feb 2013 13:59:04 -0800 (PST) Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: <87txpr26jj.fsf@topper.koldfront.dk> References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> <87txpr26jj.fsf@topper.koldfront.dk> Message-ID: <411e920d-e614-417d-9198-78bef9adba16@googlegroups.com> Everything's working now! Thank you very much, especially to you Adam! > From carandraug+dev at gmail.com Wed Feb 6 20:38:20 2013 From: carandraug+dev at gmail.com (Carnë Draug) Date: Thu, 7 Feb 2013 01:38:20 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: On 5 February 2013 20:56, Fields, Christopher J wrote: > On Feb 5, 2013, at 2:06 PM, Carnë Draug wrote: >> how much perl backwards compatibility does bioperl needs to keep? > > Aim for 5.10.1, but be careful of smart-match. Well, I solved my problem differently and ended up not needing any of the new features. But next time I'll know. Thanks Carnë From pcantalupo at gmail.com Wed Feb 6 23:04:08 2013 From: pcantalupo at gmail.com (Paul Cantalupo) Date: Wed, 6 Feb 2013 23:04:08 -0500 Subject: [Bioperl-l] bug 3376 status needs updated Message-ID: Hi, A few months ago, I fixed bug 3376 ( https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2). The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been updated to resolved or closed. Should I do this or is Chris the only one who does that?
Thank you, Paul From cjfields at illinois.edu Wed Feb 6 23:20:30 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 04:20:30 +0000 Subject: [Bioperl-l] bug 3376 status needs updated In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B45C@CHIMBX5.ad.uillinois.edu> No, go ahead and close it. Let me know if you run into perm. problems with it. chris On Feb 6, 2013, at 10:04 PM, Paul Cantalupo wrote: > Hi, > > A few months ago, I fixed bug 3376 ( > https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2). > The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been > updated to resolved or closed. Should I do this or is Chris the only one > who does that? > > Thank you, > > Paul > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From l.m.timmermans at students.uu.nl Thu Feb 7 04:07:57 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Thu, 7 Feb 2013 10:07:57 +0100 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <5112bc60.c69e320a.1e98.2028@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> <5112bc60.c69e320a.1e98.2028@mx.google.com> Message-ID: On Wed, Feb 6, 2013 at 9:26 PM, Siddhartha Basu wrote: > As far as the error i encountered, presence of Build.PL was blocking dzil > build/release process. And by default, dzil expects to generate > Build.PL during its build/release process. However, i am not sure which > mode is the most suitable for bioperl devs. You can prune the Build.PL, and then let dzil add its own. We wouldn't be the first to do that sort of thing. 
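
For reference, pruning could look roughly like this in dist.ini. This is only a sketch: it assumes the stock [GatherDir], [PruneFiles] and [ModuleBuild] plugins, not whatever the Bio-FeatureIO dist.ini actually contains, and the rest of the file is left out.

```ini
; gather everything in the repository ...
[GatherDir]

; ... drop the hand-written Build.PL from the build ...
[PruneFiles]
filename = Build.PL

; ... and let Dist::Zilla generate its own Build.PL
[ModuleBuild]
```

Setting exclude_filename = Build.PL under [GatherDir] should have the same effect without a separate pruning step.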
Leon From amackey at virginia.edu Thu Feb 7 10:25:07 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Thu, 7 Feb 2013 10:25:07 -0500 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> Message-ID: You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. -Aaron -- Aaron J. Mackey, PhD Assistant Professor Center for Public Health Genomics University of Virginia amackey at virginia.edu http://www.cphg.virginia.edu/mackey On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J wrote: > On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > > > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > > wrote: > >> > >> I see no problem in stating any generic parsing and low-level interfaces > >> are just as much a part of what BioPerl encompasses as the higher-level > >> Bio::* classes themselves. Steve and Jason were on to something with > >> SearchIO; it's maybe not as performant as we would like, but it > certainly > >> is more flexible in terms of what can be done, b/c it separates out > >> low-level parsing from object creation. That's the general model we > >> should look at. There is a good reason Biopython is following this > >> model with their SearchIO implementation (Peter C, are you reading > this?) 
> > > > Actually I don't think we did end up with that kind of separation in the > > Biopython SearchIO - which is not so say it isn't an excellent model > > to follow. Rather the Biopython SearchIO (like the BioPerl one) had > > as the first goal a consistent object model across assorted file > > formats. > > > > The idea of a low level minimal overhead parsers (which are very > > format specific), on which a heavier but consistent object model > > can be built might be a good balance - the high level API has the > > connivence, but if you give that up you can have more speed. > > That's what I recommend with FASTQ and Biopython, e.g. > > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > > > >> > >> I have started a wrapper around Heng's FASTQ/FASTA parsing > >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec > >> last I recall?). > >> > > > > I'd have to dig through my emails, but I think the BioRuby guys > > looked at that too - as I recall while it was fast, the error handling > > left something to be desired. Email me directly or on the BioRuby > > list if you want to follow up on that. > > > > Regards, > > > > Peter > > I did a little on this, worth following up on, but I pulled the FASTQ test > examples you created from the paper to test it out. IIRC it parsed where > it needed to, but I'm not sure how it handled bad sequences, so yes, worth > looking into. Maybe worth moving to open-bio-l for broader discussion. 
>
> chris
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
From tiago.hori at gmail.com Thu Feb 7 09:58:37 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Thu, 7 Feb 2013 06:58:37 -0800 (PST)
Subject: [Bioperl-l] Search I::O
In-Reply-To: <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
References: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>
 <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
Message-ID:

Thanks, Jason! It is working now.

So here is what I am trying to accomplish. For a given BLASTX report, I want to extract the best hit that is human and whose description does not contain "unnamed" or "PREDICTED". I got very close, but I still can't get it to give me only the top BLAST hit; it gives me all the hits that meet my criteria. I tried using "last" to stop it from looping through the hits once it found a human one, but it didn't work. Can someone help?

Here is my code so far (mostly stolen from the wiki).

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => 'testsalmon.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    if( $hit->description !~ /[Uu]nnamed|PREDICTED|hypothetical/){
      if( $hit->description =~ /Homo sapiens/){
        while( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
            if ( $hsp->percent_identity >= 30) {
              if( $hsp->evalue <= 1e-05){
                print "Query=", $result->query_name,"\t",
                  " Description=", $hit->description,"\t",
                  " Hit=", $hit->name,"\t",
                  " Length=", $hsp->length('total'),"\t",
                  " Percent_id=", $hsp->percent_identity,"\n";
              }
            }
          }
        }
      }
    }
  }
}

T.
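
A note on why a bare "last" probably did not help: placed inside the HSP loop, it only leaves the innermost while, so the hit loop keeps running. One way around that, sketched below, is to move the filtering into a sub and return as soon as a hit passes; a labelled loop (HIT: while (...) with last HIT) would do the same job. The sub name first_human_hit is made up for illustration, the thresholds are the ones from the code above, and the sketch assumes the report lists hits best-first, as BLAST normally does.

```perl
use strict;
use warnings;

# Return (hit, hsp) for the first hit in $result that passes the filters,
# or an empty list if none does. $result is anything that offers next_hit,
# e.g. a Bio::Search::Result::ResultI object. (Hypothetical helper.)
sub first_human_hit {
    my ($result) = @_;
    while ( my $hit = $result->next_hit ) {
        my $desc = $hit->description;
        next if $desc =~ /[Uu]nnamed|PREDICTED|hypothetical/;
        next unless $desc =~ /Homo sapiens/;
        while ( my $hsp = $hit->next_hsp ) {
            next unless $hsp->length('total') > 50;
            next unless $hsp->percent_identity >= 30;
            next unless $hsp->evalue <= 1e-05;
            return ( $hit, $hsp );    # leaves both loops at once
        }
    }
    return;
}

# Runs only when a report file is named on the command line,
# so BioPerl is loaded only when it is actually needed.
if (@ARGV) {
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new( -format => 'blast', -file => $ARGV[0] );
    while ( my $result = $in->next_result ) {
        # An empty return list makes the assignment false: skip this query.
        my ( $hit, $hsp ) = first_human_hit($result) or next;
        print join( "\t",
            "Query=" . $result->query_name,
            "Description=" . $hit->description,
            "Hit=" . $hit->name,
            "Length=" . $hsp->length('total'),
            "Percent_id=" . $hsp->percent_identity ),
          "\n";
    }
}
```

Because the sub returns per result, each query in a multi-query report gets at most one line of output.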
On Wednesday, February 6, 2013 6:46:47 PM UTC-3:30, Jason Stajich wrote: > > you are missing a comma after the -format => 'blast' > should be > my $in = Bio::SearchIO->new(-format => 'blast', > -file => 'XXX' ); > > > On Feb 5, 2013, at 7:21 AM, Tiago Hori > > wrote: > > > Hi All, > > > > I am trying to find the best putative orthologs for 44K Atlantic Salmon > > sequences, and so I need to parse 44K BLAST reports to find the best > human > > hit. I am trying to learn Seach::IO, but when I try the first example on > > the HOWTO: use strict; > > use Bio::SearchIO; > > > > my $in = new Bio::SearchIO(-format => 'blast' > > -file => 'C001R047.txt'); > > > > while( my $result = $in->next_result ) { > > ## $result is a Bio::Search::Result::ResultI compliant object > > while( my $hit = $result->next_hit ) { > > ## $hit is a Bio::Search::Hit::HitI compliant object > > while( my $hsp = $hit->next_hsp ) { > > ## $hsp is a Bio::Search::HSP::HSPI compliant object > > if( $hsp->length('total') > 50 ) { > > if ( $hsp->percent_identity >= 75 ) { > > print "Query=", $result->query_name, > > " Hit=", $hit->name, > > " Length=", $hsp->length('total'), > > " Percent_id=", $hsp->percent_identity, "\n"; > > } > > } > > } > > } > > } > > > > I get this error: Odd number of elements in hash assignment at > > /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. > > > > I am using BioPerl version 1.6.901. Is there a format problem with the > > blast reports? > > > > Any help would be greatly appreciated! > > > > T. > > _______________________________________________ > > Bioperl-l mailing list > > Biop... at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Jason Stajich > jason.... at gmail.com > ja... 
at bioperl.org > > From cjfields at illinois.edu Thu Feb 7 10:56:04 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 15:56:04 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> This will likely be the approach for more NGS-friendly Bio::Seq class. Calculation of the PHRED scores could also be deferred until needed. seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it. chris On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. > > -Aaron > > -- > Aaron J. Mackey, PhD > Assistant Professor > Center for Public Health Genomics > University of Virginia > amackey at virginia.edu > http://www.cphg.virginia.edu/mackey > > > On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J wrote: > On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > > > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > > wrote: > >> > >> I see no problem in stating any generic parsing and low-level interfaces > >> are just as much a part of what BioPerl encompasses as the higher-level > >> Bio::* classes themselves. 
Steve and Jason were on to something with > >> SearchIO; it's maybe not as performant as we would like, but it certainly > >> is more flexible in terms of what can be done, b/c it separates out > >> low-level parsing from object creation. That's the general model we > >> should look at. There is a good reason Biopython is following this > >> model with their SearchIO implementation (Peter C, are you reading this?) > > > > Actually I don't think we did end up with that kind of separation in the > > Biopython SearchIO - which is not so say it isn't an excellent model > > to follow. Rather the Biopython SearchIO (like the BioPerl one) had > > as the first goal a consistent object model across assorted file > > formats. > > > > The idea of a low level minimal overhead parsers (which are very > > format specific), on which a heavier but consistent object model > > can be built might be a good balance - the high level API has the > > connivence, but if you give that up you can have more speed. > > That's what I recommend with FASTQ and Biopython, e.g. > > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > > > >> > >> I have started a wrapper around Heng's FASTQ/FASTA parsing > >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec > >> last I recall?). > >> > > > > I'd have to dig through my emails, but I think the BioRuby guys > > looked at that too - as I recall while it was fast, the error handling > > left something to be desired. Email me directly or on the BioRuby > > list if you want to follow up on that. > > > > Regards, > > > > Peter > > I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out. IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into. Maybe worth moving to open-bio-l for broader discussion. 
> > chris > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From amackey at virginia.edu Thu Feb 7 11:09:14 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Thu, 7 Feb 2013 11:09:14 -0500 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries: while (my $seq = $seqio->nextseq) { $ct++ }; in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries. This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material. Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using. What's the state of thread-safe Perl these days? 
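A minimal sketch of the deferred ("pull") parsing described above, assuming strict four-line FASTQ records and a PHRED offset of 33. The package and method names (LazyFastqReader, LazyFastqRecord, next_record, quality_scores) are made up for illustration and are not part of BioPerl; this is only one way the idea could look, not a proposed implementation.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

package LazyFastqRecord;   # hypothetical name, not a BioPerl class

# Holds the four raw lines of one record; nothing is parsed until asked for.
sub new {
    my ($class, @lines) = @_;
    return bless { raw => \@lines }, $class;
}

# The identifier is extracted from the header line on first access only.
sub id {
    my ($self) = @_;
    ($self->{id}) = $self->{raw}[0] =~ /^\@(\S+)/ unless exists $self->{id};
    return $self->{id};
}

sub seq { $_[0]{raw}[1] }

# PHRED scores are decoded on first access (offset 33 is an assumption).
sub quality_scores {
    my ($self) = @_;
    $self->{qual} ||= [ map { ord($_) - 33 } split //, $self->{raw}[3] ];
    return $self->{qual};
}

package LazyFastqReader;   # hypothetical name, not a BioPerl class

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh }, $class;
}

# Reads exactly four lines and hands them over unparsed.
sub next_record {
    my ($self) = @_;
    my $fh = $self->{fh};
    my @lines;
    for (1 .. 4) {
        my $line = <$fh>;
        last unless defined $line;
        chomp $line;
        push @lines, $line;
    }
    return unless @lines;                                  # clean end of input
    die "truncated FASTQ record\n" unless @lines == 4;
    return LazyFastqRecord->new(@lines);
}

package main;

# Two tiny records in an in-memory file, for demonstration only.
my $fastq = "\@read1\nACGT\n+\nIIII\n\@read2\nGGCC\n+\n!!!!\n";
open my $fh, '<', \$fastq or die "cannot open in-memory FASTQ: $!";

my $reader = LazyFastqReader->new($fh);
my ($count, @first_quals) = (0);
while (my $rec = $reader->next_record) {
    # Only the first record ever pays for quality decoding.
    @first_quals = @{ $rec->quality_scores } if $count == 0;
    $count++;
}
print "$count records; first quals: @first_quals\n";
# prints "2 records; first quals: 40 40 40 40"
```

Counting entries never touches id or quality_scores, so the per-record cost stays at four line reads. Wrapped (multi-line) FASTQ would break the four-line assumption and would need real boundary detection.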
-Aaron On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < cjfields at illinois.edu> wrote: > This will likely be the approach for more NGS-friendly Bio::Seq class. > Calculation of the PHRED scores could also be deferred until needed. > > seqtk has some C-based methods that we could possibly take advantage of, > but will have to look into it. > > chris > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > You might also want to consider a lazy/pull-based parser to defer > parsing/object-building for pieces of the object that don't get used. This > also usually provides some error tolerance. > > > > -Aaron > From sidd.basu at gmail.com Thu Feb 7 11:38:47 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Thu, 7 Feb 2013 10:38:47 -0600 Subject: [Bioperl-l] Re: FASTQ, was Re:BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: <5113d899.ea64320a.489a.262d@mx.google.com> Another approach might be use map-reduce(Hadoop) if possible. I have seen one implementation in biopython's GFF3 parser. http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/ -siddhartha On Thu, 07 Feb 2013, Aaron Mackey wrote: > e.g., a pull-based FASTQ parser that did nothing else at the top level but > "chunk" the file into as-yet-unparsed four-line blobs could appear to work > very fast, if the user code did nothing but count the number of entries: > > while (my $seq = $seqio->nextseq) { $ct++ }; > > in other words, you defer *everything* except the minimal amount of > parsing/logic required to detect object boundaries. 
> > This is, in fact, the exact opposite of the event-based SearchIO "push" > parsers, which always perform the most parsing possible, despite the user > never accessing most of the material. > > Lastly, with respect to performance, if the parsing/object building > operation is not simply IO bound, then parallel parser/object-building CPU > threads could be considered, which could then dynamically adapt to > pre-parse attributes (e.g. quality scores) that the calling code was > actually using. What's the state of thread-safe Perl these days? > > -Aaron > > > On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < > cjfields at illinois.edu> wrote: > > > This will likely be the approach for more NGS-friendly Bio::Seq class. > > Calculation of the PHRED scores could also be deferred until needed. > > > > seqtk has some C-based methods that we could possibly take advantage of, > > but will have to look into it. > > > > chris > > > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > > > You might also want to consider a lazy/pull-based parser to defer > > parsing/object-building for pieces of the object that don't get used. This > > also usually provides some error tolerance. 
> > > > > > -Aaron > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Feb 7 11:55:53 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 16:55:53 +0000 Subject: [Bioperl-l] FASTQ, was Re:BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <5113d899.ea64320a.489a.262d@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> <5113d899.ea64320a.489a.262d@mx.google.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7B8@CHIMBX5.ad.uillinois.edu> I think we will want to allow for a multitude of implementations. SeqIO already allows for that to a degree, but multiple backend implementations (say, different ways of parsing/processing FASTQ and others) isn't supported yet. chris On Feb 7, 2013, at 10:38 AM, Siddhartha Basu wrote: > Another approach might be use map-reduce(Hadoop) if possible. I have > seen one implementation in biopython's GFF3 parser. > http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/ > > -siddhartha > > > On Thu, 07 Feb 2013, Aaron Mackey wrote: > >> e.g., a pull-based FASTQ parser that did nothing else at the top level but >> "chunk" the file into as-yet-unparsed four-line blobs could appear to work >> very fast, if the user code did nothing but count the number of entries: >> >> while (my $seq = $seqio->nextseq) { $ct++ }; >> >> in other words, you defer *everything* except the minimal amount of >> parsing/logic required to detect object boundaries. 
>> >> This is, in fact, the exact opposite of the event-based SearchIO "push" >> parsers, which always perform the most parsing possible, despite the user >> never accessing most of the material. >> >> Lastly, with respect to performance, if the parsing/object building >> operation is not simply IO bound, then parallel parser/object-building CPU >> threads could be considered, which could then dynamically adapt to >> pre-parse attributes (e.g. quality scores) that the calling code was >> actually using. What's the state of thread-safe Perl these days? >> >> -Aaron >> >> >> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < >> cjfields at illinois.edu> wrote: >> >>> This will likely be the approach for more NGS-friendly Bio::Seq class. >>> Calculation of the PHRED scores could also be deferred until needed. >>> >>> seqtk has some C-based methods that we could possibly take advantage of, >>> but will have to look into it. >>> >>> chris >>> >>> On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: >>> >>>> You might also want to consider a lazy/pull-based parser to defer >>> parsing/object-building for pieces of the object that don't get used. This >>> also usually provides some error tolerance. 
>>>> >>>> -Aaron >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Feb 7 12:01:07 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 17:01:07 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7EF@CHIMBX5.ad.uillinois.edu> re: thread-safe perl, so-so at best from what I understand. chris On Feb 7, 2013, at 10:09 AM, Aaron Mackey wrote: > e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries: > > while (my $seq = $seqio->nextseq) { $ct++ }; > > in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries. > > This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material. 
> > Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using. What's the state of thread-safe Perl these days? > > -Aaron > > > On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J wrote: > This will likely be the approach for more NGS-friendly Bio::Seq class. Calculation of the PHRED scores could also be deferred until needed. > > seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it. > > chris > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. > > > > -Aaron From hartzell at alerce.com Thu Feb 7 16:36:24 2013 From: hartzell at alerce.com (George Hartzell) Date: Thu, 7 Feb 2013 13:36:24 -0800 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <20756.7768.125680.662488@gargle.gargle.HOWL> Fields, Christopher J writes: > George, > > Should put your post on a pedestal :) > > tl;dr version: I completely agree, but we need help in order to do this. > [...] And therein lies the [a] problem. Don't look at me.... I'm not coding on bioinformatics problems these days (though I'm available...) so _maybe_ I shouldn't have gotten up on the soapbox. 
But I'm so sick of getting into arguments (or walking away from them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, you can't write good code in Perl, look - Ruby has GEMS!, etc... Perl of the olden days was an easy language in which to write really shitty code. Even the Perl of the BioPerl heyday wasn't really much help; role your own OO, role your own distro-building, mountains of monkey-work to provide consistent POD, versioning, etc... But that's not the Perl that I use. I have Moose and Moo. TAP and the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. MetaCPAN. Pinto. GitHub. Perlbrew. Wow. It isn't any harder to write good code, for measures that I care about, using Perl than it is *any* of the other similar languages. And it's just as easy, and happens just as frequently, for people to write shitty (undocumented, untested, poorly managed, poorly packaged, ...) stuff in the other languages. GET OFF MY LAWN, KID! (Yeah, I know...) But BioPerl *is* dying. You might be standing on the shoulders of giants when you use it to solve a problem, but you *definitely* have those same giants (and their extended families) on your shoulders every time I see you try move the project forward. All of that history has become the tail that's wagging the dog. If all y'all are going to keep the thing alive, moving forward and contributing to new great works then make Apple your hero. Deprecate the stuff that's holding you back, give folks a path forward and move on. Have fun. Use sharp tools. Do cool science. Build cool things. Advance your careers (forgot that one last time). Be reasonable and professional. Supporting last year's projects is someone else's business opportunity. g. ps. Are all y'all following this thread? http://news.ycombinator.com/item?id=5123022 Maybe someone should search down for this bit: "Where to start? Any list of this [sic] projects?" and insert a plug for the various open-bio projects. 
(But "someone" doesn't work here, he said...). From cjfields at illinois.edu Thu Feb 7 18:12:19 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 23:12:19 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1D071@CHIMBX5.ad.uillinois.edu> On Feb 7, 2013, at 3:36 PM, George Hartzell wrote: > Fields, Christopher J writes: >> George, >> >> Should put your post on a pedestal :) >> >> tl;dr version: I completely agree, but we need help in order to do this. >> [...] > > And therein lies the [a] problem. Don't look at me.... > > I'm not coding on bioinformatics problems these days (though I'm > available...) so _maybe_ I shouldn't have gotten up on the soapbox. > > But I'm so sick of getting into arguments (or walking away from > them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, > you can't write good code in Perl, look - Ruby has GEMS!, etc? Right, but that's a perception not just in the Bio* world. It's larger and more pervasive than that. > Perl of the olden days was an easy language in which to write really > shitty code. Even the Perl of the BioPerl heyday wasn't really much > help; role your own OO, role your own distro-building, mountains of > monkey-work to provide consistent POD, versioning, etc... > > But that's not the Perl that I use. I have Moose and Moo. TAP and > the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. > MetaCPAN. Pinto. GitHub. Perlbrew. Wow. Yes, and that is the direction we need to go in. 
> It isn't any harder to write good code, for measures that I care > about, using Perl than it is *any* of the other similar languages. > > And it's just as easy, and happens just as frequently, for people to > write shitty (undocumented, untested, poorly managed, poorly packaged, > ...) stuff in the other languages. Oh, I know. I'm working on some very nice looking but terribly implemented Python code now. > GET OFF MY LAWN, KID! (Yeah, I know...) > > But BioPerl *is* dying. You might be standing on the shoulders of > giants when you use it to solve a problem, but you *definitely* have > those same giants (and their extended families) on your shoulders > every time I see you try move the project forward. All of that > history has become the tail that's wagging the dog. Yep. > If all y'all are going to keep the thing alive, moving forward and > contributing to new great works then make Apple your hero. Deprecate > the stuff that's holding you back, give folks a path forward and move > on. That's fine. > Have fun. Use sharp tools. Do cool science. Build cool things. > Advance your careers (forgot that one last time). Be reasonable and > professional. > > Supporting last year's projects is someone else's business > opportunity. > > g. Right, but this isn't just my show. I can't do this alone; it's simply too much code and I don't have even 1/4 the time I used to have. > ps. Are all y'all following this thread? > > http://news.ycombinator.com/item?id=5123022 > > Maybe someone should search down for this bit: "Where to start? Any > list of this [sic] projects?" and insert a plug for the various > open-bio projects. (But "someone" doesn't work here, he said?). Read the original guy's post. He's completely delusional (okay, maybe not *completely*, but he comes across as quite bitter and unrealistic). Frankly I don't feel so bad if he wants to leave. He doesn't like messy things. Biology is messy, if one doesn't understand that then computational biology is not for them. 
chris From carandraug+dev at gmail.com Thu Feb 7 23:12:22 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Fri, 8 Feb 2013 04:12:22 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version Message-ID: On 6 February 2013 22:11, "Fields, Christopher J" wrote: > [...] > So: > > If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that. > > If it means creating a new Bio-NGS repo to focus some of these efforts, so be it. > > If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it. > > If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes). > > If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later. > > But I can't do it alone. We (not just me, but we) need to drive the direction we take. > > First one who codes gets the gold ring. Hi I know I'm not much involved with bioperl development but here's my suggestion as maintainer of another quite modular free software project. I swear I'm not promoting it. Skip to the last paragraph for the very short version. Octave Forge is now a collection of packages for GNU Octave, each released independently whenever its maintainer sees fit. But it wasn't like that before. For a long time, everything was released at the same time, there was no independent packages. Then it was decided to split it into sections: main, extra and nonfree (free software dependent on non-free libraries, now purged), and inside those, it was split into packages, each with its own maintainer. But some packages were (and are) more active that the others. Some packages even came from single contributions and we never heard from the authors again. And so, with time, cruft settled in. We didn't want to remove the code, but no one was interested or comfortable enough on the field, to fix it either. 
Packages that had much more active development were being dragged down by code that no one was maintaining.

So we broke with that and each package is now released independently. Yes, we have packages that haven't been released in 3 years, but that just shows which packages no one cares about. Those have been marked as unmaintained and anyone can come around and make a release if they care about it.

As the maintainer of the project, I do *not* make the releases of the packages. The package maintainers prepare everything and upload it; I only run a handful of tests (takes me 10 min), upload it to our server, and make the official announcement. I am also the maintainer of one of the packages, and have often made releases of unmaintained packages because I needed them. That goes to show that if they are important enough for someone, they will get a release somehow. If they are not important, why would we waste our time on them anyway?

We now have around 5 package releases per month, many of them being minor releases with a handful of bug fixes. Preparing a release of a small package is much easier and much less trouble than preparing a giant release encompassing all of them at the same time.

Short version:
I'd recommend splitting the project into much smaller ones. Some of the small ones will wither and die, but those are the less important ones, and that will allow the others, the ones that people care about, freedom to grow faster. Bioperl would still be just one project that incorporates a hundred or so smaller modules. Let those who care the most about a specific module take care of it and make the releases. Releasing a module becomes much simpler, which means more releases, more activity, and the smaller code base for each module also makes it less intimidating for new contributors.

Carnë

From hartzell at alerce.com Fri Feb 8 01:17:17 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 22:17:17 -0800
Subject: [Bioperl-l] injecting a bit of levity....
Message-ID: <20756.39021.553502.116384@gargle.gargle.HOWL>

Perl's not dead. It's FAMOUS!

http://imgs.xkcd.com/comics/perl_problems.png

g.

From carandraug+dev at gmail.com Fri Feb 8 01:57:30 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Fri, 8 Feb 2013 06:57:30 +0000
Subject: [Bioperl-l] getting a Bio::Search::HSP::HSPI from Bio::SimpleAlign (to find differences between sequences)
Message-ID:

Hi

I already have a Bio::SimpleAlign object (got it after using TCoffee through the bioperl-run module) and I'm trying to get a Bio::Search::HSP::HSPI object from a pair of the aligned sequences. How can I do this? I want to use the seq_inds method to compare the sequences.

Here's my actual problem, just in case I should be trying to fix it some other way. I have a bunch of sequences from protein isoforms. They have small differences between them: point mutations, small insertions or deletions, nothing too big. I want to make a table of the mutations that each of them has against the consensus sequence. I already made the alignment and got the consensus with "$align->consensus_string". Now, I want to get something like:

isoform1: Ala67Gly, His90_Met91insGln
isoform2: ....

The seq_inds method from the Bio::Search::HSP::HSPI class seems to do the part of finding the differences, but how can I get one? I can't find it in the documentation.

Any tips, and even showing a different approach to my problem, are most appreciated.

Thanks,
Carnë
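One possible approach to the question above that avoids HSP objects altogether is to walk the gapped strings column by column. The sketch below is plain Perl working on aligned strings; the function name diff_against_reference and the demo sequences are invented for illustration, and it reports one-letter codes (A3G) rather than the three-letter form (Ala3Gly) shown in the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Compare one aligned sequence against an aligned reference (for example the
# consensus), column by column. Both strings must have the same length and
# use '-' or '.' for gaps. Returns a list of difference labels.
# (Hypothetical helper, not part of BioPerl.)
sub diff_against_reference {
    my ($ref, $query) = @_;
    die "aligned strings differ in length\n"
        unless length($ref) == length($query);

    my @diffs;
    my $pos = 0;     # 1-based position in the ungapped reference
    my $ins = '';    # residues inserted since the last reference residue

    for my $col (0 .. length($ref) - 1) {
        my $r = substr($ref,   $col, 1);
        my $q = substr($query, $col, 1);
        my $r_gap = $r =~ /[-.]/;
        my $q_gap = $q =~ /[-.]/;

        if ($r_gap) {
            # Reference has a gap: a query residue here is an insertion.
            $ins .= $q unless $q_gap;
            next;
        }

        # A reference residue was reached: flush any pending insertion first.
        if (length $ins) {
            push @diffs, sprintf('%d_%dins%s', $pos, $pos + 1, $ins);
            $ins = '';
        }
        $pos++;

        if ($q_gap) {
            push @diffs, "$r${pos}del";     # residue missing in the query
        }
        elsif (uc $q ne uc $r) {
            push @diffs, "$r$pos$q";        # substitution
        }
    }

    # Insertion after the last reference residue.
    push @diffs, sprintf('%d_%dins%s', $pos, $pos + 1, $ins) if length $ins;

    return @diffs;
}

# Made-up aligned sequences, for demonstration only.
my $consensus = 'MKA-LHG';
my %aligned   = (
    isoform1 => 'MKGQL-G',
    isoform2 => 'MKA-LHG',
);

for my $name (sort keys %aligned) {
    my @d = diff_against_reference($consensus, $aligned{$name});
    printf "%s: %s\n", $name, @d ? join(', ', @d) : 'no differences';
}
# prints:
#   isoform1: A3G, 3_4insQ, H5del
#   isoform2: no differences
```

With a Bio::SimpleAlign object, the gapped strings should be obtainable by calling seq on each object returned by $align->each_seq, with $align->consensus_string as the reference. It is worth checking how consensus_string represents gap-dominated columns before relying on the insertion branch; using one of the aligned isoforms as the reference avoids that question.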
From l.m.timmermans at students.uu.nl Fri Feb 8 06:18:58 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Fri, 8 Feb 2013 12:18:58 +0100 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell wrote: > But I'm so sick of getting into arguments (or walking away from > them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, > you can't write good code in Perl, look - Ruby has GEMS!, etc... > > Perl of the olden days was an easy language in which to write really > shitty code. Even the Perl of the BioPerl heyday wasn't really much > help; role your own OO, role your own distro-building, mountains of > monkey-work to provide consistent POD, versioning, etc... > > But that's not the Perl that I use. I have Moose and Moo. TAP and > the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. > MetaCPAN. Pinto. GitHub. Perlbrew. Wow. I share that experience. > But BioPerl *is* dying. You might be standing on the shoulders of > giants when you use it to solve a problem, but you *definitely* have > those same giants (and their extended families) on your shoulders > every time I see you try move the project forward. All of that > history has become the tail that's wagging the dog. I share your sentiment. Most of BioPerl is architected so badly I can't stomach it most days, and I've worked on hairy codebases included perl itself. There's just too much sick and wrong. It's like hundreds of dot-com-era cgi scripts. 
The problem (which is common in scientific computing) is that once code works it's effectively abandoned. BioPerl is essentially a gathering of more than a thousand such modules. > If all y'all are going to keep the thing alive, moving forward and > contributing to new great works then make Apple your hero. Deprecate > the stuff that's holding you back, give folks a path forward and move > on. That would be lovely, but who is going to do that? We're suffering from the tragedy of the commons. > Have fun. Use sharp tools. Do cool science. Build cool things. > Advance your careers (forgot that one last time). Be reasonable and > professional. Sounds like good advice to me :-) > Supporting last year's projects is someone else's business > opportunity. True! > ps. Are all y'all following this thread? > > http://news.ycombinator.com/item?id=5123022 > > Maybe someone should search down for this bit: "Where to start? Any > list of this [sic] projects?" and insert a plug for the various > open-bio projects. (But "someone" doesn't work here, he said...). Interesting discussion, though the original post is too cynical even for my taste. Leon From cjfields at illinois.edu Fri Feb 8 09:08:56 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 8 Feb 2013 14:08:56 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1DA2D@CHIMBX5.ad.uillinois.edu> On Feb 8, 2013, at 5:18 AM, Leon Timmermans wrote: > On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell wrote: >> But I'm so sick of getting into arguments (or walking away from >> them...) 
with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, >> you can't write good code in Perl, look - Ruby has GEMS!, etc... >> >> Perl of the olden days was an easy language in which to write really >> shitty code. Even the Perl of the BioPerl heyday wasn't really much >> help; role your own OO, role your own distro-building, mountains of >> monkey-work to provide consistent POD, versioning, etc... >> >> But that's not the Perl that I use. I have Moose and Moo. TAP and >> the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. >> MetaCPAN. Pinto. GitHub. Perlbrew. Wow. > > I share that experience. > >> But BioPerl *is* dying. You might be standing on the shoulders of >> giants when you use it to solve a problem, but you *definitely* have >> those same giants (and their extended families) on your shoulders >> every time I see you try move the project forward. All of that >> history has become the tail that's wagging the dog. > > I share your sentiment. Most of BioPerl is architected so badly I > can't stomach it most days, and I've worked on hairy codebases > included perl itself. There's just too much sick and wrong. It's like > hundreds of dot-com-era cgi scripts. > > The problem (which is common in scientific computing) is that once > code works it's effectively abandoned. BioPerl is essentially a > gathering of more than a thousand such modules. Yep, the progression from 'it works' to 'it works very well' tends to have very high activation energy. Many of the fixes tend to be more bandaids (get it working) than fundamental surgery. I tried my hand at this, got a few things done. >> If all y'all are going to keep the thing alive, moving forward and >> contributing to new great works then make Apple your hero. Deprecate >> the stuff that's holding you back, give folks a path forward and move >> on. > > That would be lovely, but who is going to do that? We're suffering > from the tragedy of the commons. 
Spot on, but we could break that path for the time being. I think BioPerl as is will have to be in maintenance mode; we need a new effort to break with older perl, older practices.

>> Have fun. Use sharp tools. Do cool science. Build cool things.
>> Advance your careers (forgot that one last time). Be reasonable and
>> professional.
>
> Sounds like good advice to me :-)
>
>> Supporting last year's projects is someone else's business
>> opportunity.
>
> True!

We just need to make a bioperl 1.x branch for the maintenance bit, rechristen 'master' as 'v2', and just move on to fixing the f****** code. Let's move on that.

>> ps. Are all y'all following this thread?
>>
>> http://news.ycombinator.com/item?id=5123022
>>
>> Maybe someone should search down for this bit: "Where to start? Any
>> list of this [sic] projects?" and insert a plug for the various
>> open-bio projects. (But "someone" doesn't work here, he said...).
>
> Interesting discussion, though the original post is too cynical even
> for my taste.
>
> Leon

Yes, that's not unusual unfortunately. We have a number of physicists and mathematicians here who have started their initial forays into computational biology; they're all startled at how noisy it is and how messy code can be. Of course their disciplines have had the benefit of teaching students how to (somewhat decently) code for the last 40 years.

chris

From l.m.timmermans at students.uu.nl Fri Feb 8 07:08:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Fri, 8 Feb 2013 13:08:06 +0100
Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version
In-Reply-To:
References:
Message-ID:

On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug wrote:
> Short version:
> I'd recommend to split the project into much smaller ones. Some of the
> small ones will wither and die but those are the less important ones,
> and will allow the others, the ones that people care about, freedom to
> grow faster.
Bioperl would still be just one project, that
> incorporates a hundred or so of smaller modules. Let those who care
> the most about a specific module to take care of it and make the
> releases. Releasing a module becomes much simpler, which means more
> releases, more activity, and the smaller code base for each module
> also make it less intimidating for new contributors.

That has been a goal for some time now, but it's fairly complicated. Not only do we have a LOT of modules (bioperl-live alone is more than 900), they also have complicated dependencies. I've attached the results of my static dependency analysis of bioperl-live. I suspect this split-up needs to be done by automated graph analysis; it's too much to do by hand.

Leon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.dot
Type: application/octet-stream
Size: 93463 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deps.png
Type: image/png
Size: 6694525 bytes
Desc: not available
URL:

From sebastien.moretti at unil.ch Fri Feb 8 11:19:29 2013
From: sebastien.moretti at unil.ch (Moretti Sébastien)
Date: Fri, 08 Feb 2013 17:19:29 +0100
Subject: [Bioperl-l] PhyloXML
Message-ID: <51152591.9010402@unil.ch>

Hi

I would like to add some XML to an existing PhyloXML tree. Reading and writing it is no problem.

I would like to add something after the tag as in http://www.phyloxml.org/examples_syntax/phyloxml_syntax_example_1.html but I get problems with add_phyloXML_annotation():

Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, line 1 (#1)
    (F) You called a method correctly, and it correctly indicated a package
    functioning as a class, but that package doesn't define that particular
    method, nor does any of its base classes. See perlobj.
Uncaught exception from user code: Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, line 1. at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984 Bio::TreeIO::phyloxml::element_default('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 670 Bio::TreeIO::phyloxml::processXMLNode('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 309 Bio::TreeIO::phyloxml::add_phyloXML_annotation('Bio::TreeIO::phyloxml=HASH(0x134b1268)', '-obj', 'Bio::Tree::Tree=HASH(0x13525258)', '-xml', 'SUMF family') called at ./add_annotation_to_phyloxml.pl line 40 I think I am doing something wrong, but what? Here is the code: my $treeio = new Bio::TreeIO(-file => "$infile", -format => 'phyloxml', ); my $tree = $treeio->next_tree; # Add annotation $treeio->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); -- Sébastien Moretti From cjfields at illinois.edu Sat Feb 9 01:25:17 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 06:25:17 +0000 Subject: [Bioperl-l] BioPerl future Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). 
That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? chris From cjfields at illinois.edu Sat Feb 9 01:43:24 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 06:43:24 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F2C6@CHIMBX5.ad.uillinois.edu> On Feb 8, 2013, at 6:08 AM, Leon Timmermans wrote: > On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug wrote: >> Short version: >> I'd recommend to split the project into much smaller ones. Some of the >> small ones will wither and die but those are the less important ones, >> and will allow the others, the ones that people care about, freedom to >> grow faster. Bioperl would still be just one project, that >> incorporates a hundred or so of smaller modules. Let those who care >> the most about a specific module to take care of it and make the >> releases. Releasing a module becomes much simpler, which means more >> releases, more activity, and the smaller code base for each module >> also make it less intimidating for new contributors. > > That has been a goal for some time now, but it's fairly complicated. > Not only do we have a LOT of modules (bioperl-live alone is more than > 900), they also have complicated dependencies. I've attached the > results of my static dependency analysis of bioperl-live. I suspect > this split-up needs to be done by automated graph analysis; it's too much > to do by hand. > > Leon > Leon, I'm hoping we can do this sooner rather than later. In fact, if we proceed with making a 'v1' branch or something similar, we can start extricating code sooner rather than later (next few weeks). 
chris From cjfields at illinois.edu Sat Feb 9 08:51:35 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 13:51:35 +0000 Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future Message-ID: Sheldon, The branch is where the old (v1.x) code would reside. Master branch would be v2. Chris Sent via phone -------- Original message -------- From: Sheldon McKay Date: To: "Fields, Christopher J" Cc: BioPerl List ,gmod-gbrowse at lists.sourceforge.net Subject: Re: [Gmod-gbrowse] BioPerl future Hi Chris, This sounds like a good idea. I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands. One question I have is whether the name of branch v1 might be perceived as a step backward. How about v2? Sheldon On Saturday, February 9, 2013, Fields, Christopher J wrote: All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? 
chris ------------------------------------------------------------------------------ Free Next-Gen Firewall Hardware Offer Buy your Sophos next-gen firewall before the end March 2013 and get the hardware for free! Learn more. http://p.sf.net/sfu/sophos-d2d-feb _______________________________________________ Gmod-gbrowse mailing list Gmod-gbrowse at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse -- Sheldon McKay, PhD Computational Biologist DNA Learning Center Cold Spring Harbor Laboratory 1 Bungtown Rd Cold Spring Harbor, NY 11724 (516) 367-5185 www.dnalc.org From sheldon.mckay at gmail.com Sat Feb 9 08:04:50 2013 From: sheldon.mckay at gmail.com (Sheldon McKay) Date: Sat, 9 Feb 2013 08:04:50 -0500 Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> Message-ID: Hi Chris, This sounds like a good idea. I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands. One question I have is whether the name of branch v1 might be perceived as a step backward. How about v2? Sheldon On Saturday, February 9, 2013, Fields, Christopher J wrote: > All, > > (cross-posting to gmod-gbrowse) > > I want to gauge the community's thoughts on a few things. At the moment I > think we can safely say that BioPerl 1.x is in maintenance mode. By > 'maintenance mode', I mean that we can only do so much with it w/o breaking > backwards compatibility with old scripts. We need a way forward so that we > can address fundamental problems within the core codebase, namely speed. 
> > I am thinking at the moment of pushing a 'v1' branch next week after I > make an official announcement, with a new 1.6 release coming out from that > branch (as already announced, tentatively scheduled for March 1). That > frees up master for any code development, removal of modules/cruft, etc. > This will open an initial path forward and at least enable us to do more. > Make sense? This of course means that any code reliant on v1 should pull > from that branch instead of 'master'. > > Thoughts? > > chris > -- Sheldon McKay, PhD Computational Biologist DNA Learning Center Cold Spring Harbor Laboratory 1 Bungtown Rd Cold Spring Harbor, NY 11724 (516) 367-5185 www.dnalc.org From cjfields at illinois.edu Sat Feb 9 23:25:14 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 10 Feb 2013 04:25:14 +0000 Subject: [Bioperl-l] BioPerl future In-Reply-To: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> Apologies if you receive this twice. I never received the replies from the gbrowse list through bioperl-l so it is possible there were mail issues last night. ------------------------ All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. 
By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? chris From genehack at genehack.org Sat Feb 9 23:36:07 2013 From: genehack at genehack.org (John SJ Anderson) Date: Sat, 9 Feb 2013 20:36:07 -0800 Subject: [Bioperl-l] BioPerl future In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6@genehack.org> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: > Thoughts? +1 The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. j. 
-- John SJ Anderson // genehack at genehack.org From carandraug+dev at gmail.com Sun Feb 10 13:40:33 2013 From: carandraug+dev at gmail.com (Carnë Draug) Date: Sun, 10 Feb 2013 18:40:33 +0000 Subject: [Bioperl-l] BioPerl future Message-ID: On 10 February 2013 17:00, wrote: > Message: 3 > Date: Sat, 9 Feb 2013 20:36:07 -0800 > From: John SJ Anderson > Subject: Re: [Bioperl-l] BioPerl future > To: "Fields, Christopher J" > Cc: BioPerl List > Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org> > Content-Type: text/plain; charset=us-ascii > > On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: > >> Thoughts? > > +1 > > The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. For those interested, I have just added instructions on the wiki on how to split a subset of modules, tests, files, etc. from the bioperl-live repository into a new repository while keeping their old history. http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live Carnë From cjfields at illinois.edu Sun Feb 10 15:08:35 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 10 Feb 2013 20:08:35 +0000 Subject: [Bioperl-l] BioPerl future In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE20632@CHIMBX5.ad.uillinois.edu> On Feb 10, 2013, at 12:40 PM, Carnë Draug wrote: > On 10 February 2013 17:00, wrote: >> Message: 3 >> Date: Sat, 9 Feb 2013 20:36:07 -0800 >> From: John SJ Anderson >> Subject: Re: [Bioperl-l] BioPerl future >> To: "Fields, Christopher J" >> Cc: BioPerl List >> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org> >> Content-Type: text/plain; charset=us-ascii >> >> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: >> >>> Thoughts? 
>> >> +1 >> >> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. > > For those interested, I have just added instructions on the wiki on > how to split a subset of modules, tests, files, etc from the > bioperl-live repository into a new repository while keeping their old > history. > > http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live > > Carnë It's probably worth looking at this page as well, then: http://www.bioperl.org/wiki/BioPerl_Modularization We should probably merge the two. chris From hlapp at drycafe.net Sun Feb 10 20:03:34 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Sun, 10 Feb 2013 20:03:34 -0500 Subject: [Bioperl-l] PhyloXML In-Reply-To: <51152591.9010402@unil.ch> References: <51152591.9010402@unil.ch> Message-ID: On Feb 8, 2013, at 11:19 AM, Moretti Sébastien wrote: > # Add annotation > $treeio->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From sebastien.moretti at unil.ch Mon Feb 11 02:08:22 2013 From: sebastien.moretti at unil.ch (Sébastien MORETTI) Date: Mon, 11 Feb 2013 08:08:22 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: References: <51152591.9010402@unil.ch> Message-ID: <511898E6.7060400@unil.ch> >> # Add annotation >> $treeio->add_phyloXML_annotation(-obj => $tree, >> -xml => 'SUMF family', >> ); > > If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? > > -hilmar I replaced $treeio with $tree in the above line but still get an error. I don't see what you mean by "the stack suggests that the above isn't the exact line in your script". The only thing I changed is the length of the XML string I try to insert, but I get the same error with an empty XML string. my $treeio = new Bio::TreeIO(-file => "$infile", -format => 'phyloxml', ); my $tree = $treeio->next_tree; # Add annotation $tree->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor does any of its base classes. See perlobj. Uncaught exception from user code: Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. 
at ./add_annotation_to_phyloxml.pl line 40 -- S?bastien Moretti Department of Ecology and Evolution, Biophore, University of Lausanne, CH-1015 Lausanne, Switzerland Tel.: +41 (21) 692 4221/4079 http://bioinfo.unil.ch/ From saladi1 at illinois.edu Tue Feb 12 16:24:34 2013 From: saladi1 at illinois.edu (Shyam Saladi) Date: Tue, 12 Feb 2013 13:24:34 -0800 Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons Message-ID: Hi, I am using the count_codons method from Bio::Tools::SeqStats and keep getting "AMBIGUOUS" codons, but I can't figure out why exactly. When I translate the same sequence that gives the error using another standard utility like (ExPASy - Translate), it seems to work alright. An example sequence is below. Could anyone lend some insight? Thanks, Shyam AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC AGT *AMBIGUOUS* ATA ATC ATG ATT CAA CAC CAG CAT CCA CCC CCG CCT CGA CGC CGG CGT CTA CTC CTG CTT GAA GAC GAG GAT GCA GCC GCG GCT GGA GGC GGG GGT GTA GTC GTG GTT TAA TAC TAT TCA TCC TCG TCT TGG TGT TTA TTC TTG TTT count filename 1.722488038277511961722488038277511961722 2.966507177033492822966507177033492822967 1.531100478468899521531100478468899521531 0.9569377990430622009569377990430622009569 0.4784688995215311004784688995215311004785 1.722488038277511961722488038277511961722 1.33971291866028708133971291866028708134 1.913875598086124401913875598086124401914 0.1913875598086124401913875598086124401914 0.7655502392344497607655502392344497607656 1.435406698564593301435406698564593301435 * 0.09569377990430622009569377990430622009569* 0.3827751196172248803827751196172248803828 2.488038277511961722488038277511961722488 3.349282296650717703349282296650717703349 3.636363636363636363636363636363636363636 2.870813397129186602870813397129186602871 0.3827751196172248803827751196172248803828 1.626794258373205741626794258373205741627 0.4784688995215311004784688995215311004785 1.722488038277511961722488038277511961722 0.5741626794258373205741626794258373205742 
1.052631578947368421052631578947368421053 1.244019138755980861244019138755980861244 0.3827751196172248803827751196172248803828 0.7655502392344497607655502392344497607656 0.1913875598086124401913875598086124401914 2.488038277511961722488038277511961722488 0.4784688995215311004784688995215311004785 0.6698564593301435406698564593301435406699 2.105263157894736842105263157894736842105 0.8612440191387559808612440191387559808612 2.870813397129186602870813397129186602871 1.435406698564593301435406698564593301435 1.722488038277511961722488038277511961722 2.775119617224880382775119617224880382775 2.00956937799043062200956937799043062201 2.488038277511961722488038277511961722488 3.540669856459330143540669856459330143541 2.00956937799043062200956937799043062201 0.1913875598086124401913875598086124401914 2.392344497607655502392344497607655502392 0.8612440191387559808612440191387559808612 5.454545454545454545454545454545454545455 1.913875598086124401913875598086124401914 0.8612440191387559808612440191387559808612 4.593301435406698564593301435406698564593 2.679425837320574162679425837320574162679 0.09569377990430622009569377990430622009569 1.148325358851674641148325358851674641148 1.148325358851674641148325358851674641148 0.8612440191387559808612440191387559808612 0.4784688995215311004784688995215311004785 2.105263157894736842105263157894736842105 0.9569377990430622009569377990430622009569 0.9569377990430622009569377990430622009569 0.09569377990430622009569377990430622009569 2.679425837320574162679425837320574162679 2.966507177033492822966507177033492822967 3.062200956937799043062200956937799043062 2.775119617224880382775119617224880382775 1045 temp.seq 
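A note on the single AMBIGUOUS codon in the table above (0.0957% of 1045 codons is exactly one codon): one plausible cause, offered as an assumption rather than a confirmed diagnosis, is an IUPAC ambiguity code in the input. The sequence that follows contains a Y (C or T) in the run GGCTTYTTGATG. The sketch below is plain Perl and does not call Bio::Tools::SeqStats; tally_codons is a hypothetical helper that groups any codon containing a character outside A, C, G, T, U under the key 'ambiguous', and the reading frame used for the fragment is illustrative only.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper, NOT the Bio::Tools::SeqStats implementation:
# tally in-frame codons and group any codon that contains a character
# outside A, C, G, T, U under the key 'ambiguous'.
sub tally_codons {
    my ($seq) = @_;
    $seq = uc $seq;
    $seq =~ s/\s+//g;    # drop whitespace and line breaks
    my %count;
    for (my $i = 0; $i + 3 <= length $seq; $i += 3) {
        my $codon = substr $seq, $i, 3;
        my $key = $codon =~ /^[ACGTU]{3}$/ ? $codon : 'ambiguous';
        $count{$key}++;
    }
    return \%count;
}

# Short fragment copied from the posted sequence; the frame chosen here
# is an assumption made for illustration.
my $counts = tally_codons('GGCTTYTTGATG');
for my $codon (sort keys %$counts) {
    print "$codon\t$counts->{$codon}\n";
}
```

In this fragment the codon TTY is the one counted as ambiguous. A translation tool may still produce a residue for such a codon when every expansion of the ambiguity code gives the same amino acid (TTC and TTT both encode phenylalanine), which would be consistent with the translation appearing to work. Scanning the full sequence with a pattern such as /[^ACGTU]/i is a quick way to check for such characters before counting.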
ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTACGCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTCGTATTTGCCTTCGTACCACCTGC
GGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAGATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTAGGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA From bosborne11 at verizon.net Tue Feb 12 21:30:08 2013 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 12 Feb 2013 21:30:08 -0500 Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons In-Reply-To: References: Message-ID: Shyam, An ambiguous codon would be one that has a character other than [ACTGU] in it. I see '!' in your sequences, that would create an ambiguous codon. Brian O. On Feb 12, 2013, at 4:24 PM, Shyam Saladi wrote: > Hi, > > I am using the count_codons method from Bio::Tools::SeqStats and keep > getting "AMBIGUOUS" codons, but I can't figure out why exactly. > > When I translate the same sequence that gives the error using another > standard utility like (ExPASy - Translate), it seems to work alright. > > An example sequence is below. Could anyone lend some insight? 
> > Thanks, > Shyam > > > > AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC > AGT *AMBIGUOUS* ATA ATC ATG ATT CAA CAC > CAG CAT CCA CCC CCG CCT CGA CGC CGG > CGT CTA CTC CTG CTT GAA GAC GAG GAT GCA > GCC GCG GCT GGA GGC GGG GGT GTA GTC > GTG GTT TAA TAC TAT TCA TCC TCG TCT TGG > TGT TTA TTC TTG TTT count filename > 1.722488038277511961722488038277511961722 > 2.966507177033492822966507177033492822967 > 1.531100478468899521531100478468899521531 > 0.9569377990430622009569377990430622009569 > 0.4784688995215311004784688995215311004785 > 1.722488038277511961722488038277511961722 > 1.33971291866028708133971291866028708134 > 1.913875598086124401913875598086124401914 > 0.1913875598086124401913875598086124401914 > 0.7655502392344497607655502392344497607656 > 1.435406698564593301435406698564593301435 * > 0.09569377990430622009569377990430622009569* > 0.3827751196172248803827751196172248803828 > 2.488038277511961722488038277511961722488 > 3.349282296650717703349282296650717703349 > 3.636363636363636363636363636363636363636 > 2.870813397129186602870813397129186602871 > 0.3827751196172248803827751196172248803828 > 1.626794258373205741626794258373205741627 > 0.4784688995215311004784688995215311004785 > 1.722488038277511961722488038277511961722 > 0.5741626794258373205741626794258373205742 > 1.052631578947368421052631578947368421053 > 1.244019138755980861244019138755980861244 > 0.3827751196172248803827751196172248803828 > 0.7655502392344497607655502392344497607656 > 0.1913875598086124401913875598086124401914 > 2.488038277511961722488038277511961722488 > 0.4784688995215311004784688995215311004785 > 0.6698564593301435406698564593301435406699 > 2.105263157894736842105263157894736842105 > 0.8612440191387559808612440191387559808612 > 2.870813397129186602870813397129186602871 > 1.435406698564593301435406698564593301435 > 1.722488038277511961722488038277511961722 > 2.775119617224880382775119617224880382775 > 2.00956937799043062200956937799043062201 > 2.488038277511961722488038277511961722488 > 
3.540669856459330143540669856459330143541 > 2.00956937799043062200956937799043062201 > 0.1913875598086124401913875598086124401914 > 2.392344497607655502392344497607655502392 > 0.8612440191387559808612440191387559808612 > 5.454545454545454545454545454545454545455 > 1.913875598086124401913875598086124401914 > 0.8612440191387559808612440191387559808612 > 4.593301435406698564593301435406698564593 > 2.679425837320574162679425837320574162679 > 0.09569377990430622009569377990430622009569 > 1.148325358851674641148325358851674641148 > 1.148325358851674641148325358851674641148 > 0.8612440191387559808612440191387559808612 > 0.4784688995215311004784688995215311004785 > 2.105263157894736842105263157894736842105 > 0.9569377990430622009569377990430622009569 > 0.9569377990430622009569377990430622009569 > 0.09569377990430622009569377990430622009569 > 2.679425837320574162679425837320574162679 > 2.966507177033492822966507177033492822967 > 3.062200956937799043062200956937799043062 > 2.775119617224880382775119617224880382775 1045 temp.seq > > 
ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTAC! > GCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTC! 
> GTATTTGCCTTCGTACCACCTGCGGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAG > ATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTA! > GGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 13 10:18:10 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 15:18:10 +0000 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> All, tl;dr: A lot of change is coming. Be forewarned and be prepared. This is an 'official' announcement to the BioPerl community on future BioPerl plans. We have decided to move continued maintenance of Bioperl release series over to the new 'v1' branch. 
This branch will be the point where any future versions of 1.6.x code will be released, starting with the (already-scheduled) March 1 release. The 'master' branch will become the main focal point for future development of BioPerl going into an eventual v2 release, with a focus on performance enhancements, addressing newer technologies like NGS and large data, code cleanup, and simplifying the code base. We welcome any help with code improvements. GMOD folks? Want to help? This is a good opportunity to address BioPerl short-comings in the code base! What this means for anyone using BioPerl currently: 1) We anticipate significant issues if you are relying on the 'master' branch for anything. To inelegantly state it, the core developers are taking back the 'master' branch for future development. Please please please do not rely on the 'master' branch for stable code; if you are reliant on the BioPerl 1.6.x, make sure to use 'v1'. We can revisit whether to make 'v1' the default checkout branch if/when the need arises. 2) Expect not to find some modules. We will be migrating modules requiring external dependencies and other associated chunks of the code base out into their own repositories over the next year to help future maintenance; the eventual intent is to release all of these independently on CPAN. We will completely remove all code previously marked as deprecated, and we may immediately deprecate additional modules if needed (this will of course be discussed on list). 3) Expect version numbering to change significantly. Because we are releasing code in separate repositories, I fully expect downstream versioning problems if we stick with the current system (e.g. all bioperl-live modules having the same version). It will be too much of a headache to sync versions for all modules as this will entail making a full release of all bioperl code, one of the main reasons we are splitting out code to begin with. 
At the moment, no specific versioning scheme has been chosen, though I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions). This is the standard that Lincoln has adopted for Bio::Graphics and GBrowse. 4) Expect quick deprecation of methods within modules as needed. These should of course be brought up to the mail list prior to actual implementation, but I would anticipate some things changing as we try to adopt a more consistent method naming scheme. 5) The same steps outlined for bioperl-live will apply for bioperl-run modules. We will have to decide the best approach to use for those, e.g. whether to separate them out based on task (alignment), application group (NGS, BLAST, RNA), etc. and how these may fit organically with bioperl-live modules where appropriate. 6) Do not expect a new CPAN release of such code until Dec 2013. Even then it will be in an alpha stage. We are all busy campers. We do not anticipate significant changes to bioperl-network or bioperl-db at this time beyond updating them to deal with new changes. I'm sure there are many other points that need to be discussed. Please reply over the next week if you have any concerns. chris From cjfields at illinois.edu Wed Feb 13 11:01:07 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 16:01:07 +0000 Subject: [Bioperl-l] Test-pls ignore Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2506D@CHIMBX5.ad.uillinois.edu> testing the mail list to see if it is working. 
-c From sebastien.moretti at unil.ch Wed Feb 13 11:21:23 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=) Date: Wed, 13 Feb 2013 17:21:23 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> Message-ID: <511BBD83.2000708@unil.ch> >>>> # Add annotation >>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>> -xml => 'SUMF family', >>>> ); >>> >>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >>> >>> -hilmar >> >> I replaced $treeio by $tree in the above line but still get an error. >> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >> >> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. >> >> >> >> my $treeio = new Bio::TreeIO(-file => "$infile", >> -format => 'phyloxml', >> ); >> my $tree = $treeio->next_tree; >> >> # Add annotation >> $tree->add_phyloXML_annotation(-obj => $tree, >> -xml => 'SUMF family', >> ); >> >> Can't locate object method "add_phyloXML_annotation" via package >> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >> (F) You called a method correctly, and it correctly indicated a package >> functioning as a class, but that package doesn't define that particular >> method, nor does any of its base classes. See perlobj. >> >> Uncaught exception from user code: >> Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. >> at ./add_annotation_to_phyloxml.pl line 40 > > Will have to look into this. 
One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. > > chris You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? The problem I have is "expected" ? -- Sébastien Moretti From cjfields at illinois.edu Wed Feb 13 10:47:17 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 15:47:17 +0000 Subject: [Bioperl-l] PhyloXML In-Reply-To: <511898E6.7060400@unil.ch> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> On Feb 11, 2013, at 1:08 AM, Sébastien MORETTI wrote: >>> # Add annotation >>> $treeio->add_phyloXML_annotation(-obj => $tree, >>> -xml => 'SUMF family', >>> ); >> >> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >> >> -hilmar > > I replaced $treeio by $tree in the above line but still get an error. > Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" > > The only thing I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. > > > > my $treeio = new Bio::TreeIO(-file => "$infile", > -format => 'phyloxml', > ); > my $tree = $treeio->next_tree; > > # Add annotation > $tree->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); > > Can't locate object method "add_phyloXML_annotation" via package > "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) > (F) You called a method correctly, and it correctly indicated a package > functioning as a class, but that package doesn't define that particular > method, nor does any of its base classes. See perlobj.
> > Uncaught exception from user code: > Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. > at ./add_annotation_to_phyloxml.pl line 40 > > > > -- > Sébastien Moretti > Department of Ecology and Evolution, > Biophore, University of Lausanne, > CH-1015 Lausanne, Switzerland > Tel.: +41 (21) 692 4221/4079 > http://bioinfo.unil.ch/ Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. chris From carandraug+dev at gmail.com Wed Feb 13 12:23:23 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Wed, 13 Feb 2013 17:23:23 +0000 Subject: [Bioperl-l] Next BioPerl release Message-ID: On 5 February 2013 21:53, Fields, Christopher J wrote: > I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! Hi is this release of bioperl-live only or also includes bioperl-run? Carnë From cjfields at illinois.edu Wed Feb 13 12:08:21 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 17:08:21 +0000 Subject: [Bioperl-l] PhyloXML In-Reply-To: <511BBD83.2000708@unil.ch> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> <511BBD83.2000708@unil.ch> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 10:21 AM, Moretti Sébastien wrote: >>>>> # Add annotation >>>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>>> -xml => 'SUMF family', >>>>> ); >>>> >>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>> >>>> -hilmar >>> >>> I replaced $treeio by $tree in the above line but still get an error. >>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >>> >>> The only thing I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. >>> >>> >>> >>> my $treeio = new Bio::TreeIO(-file => "$infile", >>> -format => 'phyloxml', >>> ); >>> my $tree = $treeio->next_tree; >>> >>> # Add annotation >>> $tree->add_phyloXML_annotation(-obj => $tree, >>> -xml => 'SUMF family', >>> ); >>> >>> Can't locate object method "add_phyloXML_annotation" via package >>> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >>> (F) You called a method correctly, and it correctly indicated a package >>> functioning as a class, but that package doesn't define that particular >>> method, nor does any of its base classes. See perlobj. >>> >>> Uncaught exception from user code: >>> >>> at ./add_annotation_to_phyloxml.pl line 40 >> >> Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. >> >> chris > > You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? > The problem I have is "expected" ? > > -- > Sébastien Moretti I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky. I tried cleaning this up a few years back but didn't make much progress. The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it): $treeio->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back. Can you file a bug on this?
https://redmine.open-bio.org/ chris From cjfields at illinois.edu Wed Feb 13 13:05:53 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 18:05:53 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 11:23 AM, Carnë Draug wrote: > On 5 February 2013 21:53, Fields, Christopher J wrote: >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! > > Hi > > is this release of bioperl-live only or also includes bioperl-run? > > Carnë We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance. chris From l.m.timmermans at students.uu.nl Wed Feb 13 14:44:22 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 13 Feb 2013 20:44:22 +0100 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J wrote: > We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance. Also, bioperl-run needs to be cut into smaller distributions even more than bioperl-live.
Few people if anyone at all has all tools it tries to wrap at hand, so its almost impossible to pass its testing suite. We need dists that can realistically pass. Leon From cjfields at illinois.edu Wed Feb 13 16:04:26 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 21:04:26 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25B07@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 1:44 PM, Leon Timmermans wrote: > On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J > wrote: >> We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance. > > Also, bioperl-run needs to be cut into smaller distributions even more > than bioperl-live. Few people if anyone at all has all tools it tries > to wrap at hand, so its almost impossible to pass its testing suite. > > We need dists that can realistically pass. > > Leon Yup. It's a mess. chris From florent.angly at gmail.com Wed Feb 13 17:33:14 2013 From: florent.angly at gmail.com (Florent Angly) Date: Thu, 14 Feb 2013 08:33:14 +1000 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> Message-ID: <511C14AA.9030107@gmail.com> On 14/02/13 01:18, Fields, Christopher J wrote: > I*highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions) Yes, I support the X.Y versioning as well. 
Florent From l.m.timmermans at students.uu.nl Wed Feb 13 18:12:06 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Thu, 14 Feb 2013 00:12:06 +0100 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development In-Reply-To: <511C14AA.9030107@gmail.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> <511C14AA.9030107@gmail.com> Message-ID: On Wed, Feb 13, 2013 at 11:33 PM, Florent Angly wrote: > On 14/02/13 01:18, Fields, Christopher J wrote: >> >> I*highly* recommend using X.Y versioning for simplicity (e.g. no more >> 3-point versions) > > Yes, I support the X.Y versioning as well. > Florent See also: http://www.dagolden.com/index.php/369/version-numbers-should-be-boring/ Leon From daisieh at gmail.com Thu Feb 14 00:21:15 2013 From: daisieh at gmail.com (Daisie Huang) Date: Wed, 13 Feb 2013 21:21:15 -0800 (PST) Subject: [Bioperl-l] Question regarding while loops for reading files In-Reply-To: References: Message-ID: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com> I think you need to reset the pointer to the filehandle before you go through the while loop the second time: seek $fh,0,0 On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote: > > Hey Guys, > > I am still at the same place. I am writing these little pieces of code to > try to learn the language better, so any advice would be useful. I am again > parsing through tab delimited files and now trying to find fish from on id > (in these case families AS5 and AS9), retrieve the weights and average > them. When I started I did it for one family and it worked (instead of the > @families I had a scalar $family set to AS5). But really it is more useful > to look at more than one family at time (I should mention that are 2 types > of fish per family one ends in PS , the other doesn't). So I tried to use a > foreach loop to go through the file twice, once with a the search value set > to AS5 and a second time to AS9. 
It works for AS5, but for some reason, the > foreach loop sets $test to AS9 the second time, but it doesn't go through > the while loop. What am I doing wrong? > > here is the code: > > #! /usr/bin/perl > use strict; > use warnings; > > my $file = $ARGV[0]; > my @family = ('AS5','AS9'); > my $i; > my $ii; > my $test; > > open (my $fh, "<", $file) or die ("Can't open $file: $!"); > > foreach (@family){ > $test = $_; > my @data_weight_2N = (); > my @data_weight_3N = (); > while (<$fh>){ > chomp; > my $line = $_; > my @data = split ("\t", $line); > if ($data[0] !~ /[0-9]*/){ > next;} > elsif ($data[1] eq "ABF09-$test"){ > $i += 1; > push (@data_weight_2N, $data[6]); > }elsif ($data[1] eq "ABF09-".$test."PS"){ > $ii += 1; > push (@data_weight_3N,$data[6]); > } > } > my $mean_2N = &average (\@data_weight_2N); > my $stdev_2N = &stdev (\@data_weight_2N); > my $stderr_2N = ($stdev_2N/sqrt($i)); > > print "These are the the avearge weight, stdev and stderr for $test > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n"; > > my $mean_3N = &average (\@data_weight_3N); > my $stdev_3N = &stdev (\@data_weight_3N); > my $stderr_3N = ($stdev_3N/sqrt($i)); > > print "These are the the avearge weight, stdev and stderr for $test > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n"; > } > > close ($fh); > > sub average{ > my($data) = @_; > if (not @$data) { > print ("Empty array\n"); > return 0; > } > my $total = 0; > foreach (@$data) { > $total += $_; > } > my $average = $total / @$data; > return $average; > } > > sub stdev{ > my($data) = @_; > if(@$data == 1){ > return 0; > } > my $average = &average($data); > my $sqtotal = 0; > foreach(@$data) { > $sqtotal += ($average-$_) ** 2; > } > my $std = ($sqtotal / (@$data-1)) ** 0.5; > return $std; > } > > Thanks, > > T. > > -- > "Education is not to be used to promote obscurantism." - Theodonius > Dobzhansky. 
> > "Gracias a la vida que me ha dado tanto > Me ha dado el sonido y el abecedario > Con él, las palabras que pienso y declaro > Madre, amigo, hermano > Y luz alumbrando la ruta del alma del que estoy amando > > Gracias a la vida que me ha dado tanto > Me ha dado la marcha de mis pies cansados > Con ellos anduve ciudades y charcos > Playas y desiertos, montañas y llanos > Y la casa tuya, tu calle y tu patio" > > Violeta Parra - Gracias a la Vida > > Tiago S. F. Hori. PhD. > Ocean Science Center-Memorial University of Newfoundland > From sebastien.moretti at unil.ch Thu Feb 14 03:09:06 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=) Date: Thu, 14 Feb 2013 09:09:06 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> <511BBD83.2000708@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> Message-ID: <511C9BA2.9000508@unil.ch> >>>>>> # Add annotation >>>>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>>>> -xml => 'SUMF family', >>>>>> ); >>>>> >>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >>>>> >>>>> -hilmar >>>> >>>> I replaced $treeio by $tree in the above line but still get an error. >>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >>>> >>>> The only thing I changed is the length of the xml string I try to insert. But get the same error with an empty xml string.
>>>> >>>> >>>> >>>> my $treeio = new Bio::TreeIO(-file => "$infile", >>>> -format => 'phyloxml', >>>> ); >>>> my $tree = $treeio->next_tree; >>>> >>>> # Add annotation >>>> $tree->add_phyloXML_annotation(-obj => $tree, >>>> -xml => 'SUMF family', >>>> ); >>>> >>>> Can't locate object method "add_phyloXML_annotation" via package >>>> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >>>> (F) You called a method correctly, and it correctly indicated a package >>>> functioning as a class, but that package doesn't define that particular >>>> method, nor does any of its base classes. See perlobj. >>>> >>>> Uncaught exception from user code: >>>> >>>> at ./add_annotation_to_phyloxml.pl line 40 >>> >>> Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. >>> >>> chris >> >> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? >> The problem I have is "expected" ? >> >> -- >> Sébastien Moretti > > I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky. I tried cleaning this up a few years back but didn't make much progress. > > The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it): > > $treeio->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); > > My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back. Can you file a bug on this? > > https://redmine.open-bio.org/ > > chris I will file a bug on this. I'd be happy to try to contribute to the phyloxml code, but I don't know how to proceed for BioPerl.
-- Sébastien Moretti From hartzell at alerce.com Thu Feb 14 15:04:44 2013 From: hartzell at alerce.com (George Hartzell) Date: Thu, 14 Feb 2013 12:04:44 -0800 Subject: [Bioperl-l] Question regarding while loops for reading files In-Reply-To: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com> References: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com> Message-ID: <20765.17244.185833.755900@gargle.gargle.HOWL> I think that it's important to get feedback on code that one has written and to try to understand how/what/why someone else has done in their code. To that end.... Since Tiago's using this to learn the language better I can't resist some comments beyond resetting the file handle. For grins I rewrote it using Text::CSV_XS and Statistics::Basic and to take a single pass through the data file using a multilevel data structure. I resisted the urge to rewrite it in Moose. Didn't even have an urge to rewrite it in R. Funny, that.... The script is here Tiago.pl https://gist.github.com/hartzell/4955401 With something like what I think the data looks like here: https://gist.github.com/hartzell/4955570 Even without that big of a rewrite, I had a bunch of local comments which are inline below. Daisie Huang writes: > [...] > On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote: > > > > Hey Guys, > > > > I am still at the same place. I am writing these little pieces of code to > > try to learn the language better, so any advice would be useful. > > [...] > > here is the code: > > > > #! /usr/bin/perl > > use strict; > > use warnings; > > > > my $file = $ARGV[0]; Slightly better would be $filename, so that when you step up to Path::Class you can differentiate a file object from a file name string. > > my @family = ('AS5','AS9'); Better would be @families, plural. See the use of $family below. > > my $i; > > my $ii; As far as I can tell, these are just counting the number of things that you push onto the various arrays.
You don't need them, referring to the list in scalar context will give you its size. > > my $test; You use this to hold the name of the family, so it's not particularly evocative. You should also restrict it's scope to within the loop. See the comment for the foreach loop. > > open (my $fh, "<", $file) or die ("Can't open $file: $!"); You made my day, three arg. open *and* you checked for errors. Nice! > > foreach (@family){ Better would be for my $family (@families) { which is evocative and restricts the scope of $family to the for loop (and for is 4 characters shorter than foreach...). > > $test = $_; No longer need this, using $family declared in the for loop with the proper scoping. > > my @data_weight_2N = (); > > my @data_weight_3N = (); > > while (<$fh>){ > > chomp; > > my $line = $_; > > my @data = split ("\t", $line); Don't parse CSV (TSV) files yourself. Get in the habit of using Text::CSV_XS. > > if ($data[0] !~ /[0-9]*/){ > > next;} > > elsif ($data[1] eq "ABF09-$test"){ > > $i += 1; You don't need the counter. > > push (@data_weight_2N, $data[6]); > > }elsif ($data[1] eq "ABF09-".$test."PS"){ > > $ii += 1; You don't need the counter. > > push (@data_weight_3N,$data[6]); > > } > > } > > my $mean_2N = &average (\@data_weight_2N); > > my $stdev_2N = &stdev (\@data_weight_2N); You don't need the ampersands on the subroutine calls. They're old school and just encourage people to make fun of our language for its use of all those funny punctuation marks . > > my $stderr_2N = ($stdev_2N/sqrt($i)); Unless I'm mistaken, this is equivalent my $stderr_2N = ($stdev_2N/sqrt(scalar @data_weight_2N)); and you don't need the counter, the explicit use of scalar there might even be redundant (I'm a coward). You use the same trick in your subroutine defn's below. 
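A minimal sketch of the counter-free standard error described above, assuming the weights are already collected in an array. The helper name std_error and the sample values are hypothetical and not part of the original script. Taking n from the array itself also sidesteps counters that are declared outside the loop and so are not reset between families.

```perl
use strict;
use warnings;

# Standard error of the mean: sample standard deviation over sqrt(n),
# with n taken from the array itself rather than from a separate counter.
sub std_error {
    my ($weights) = @_;          # array reference of numeric values
    my $n = @$weights;           # an array in scalar context gives its size
    return if $n < 2;            # not defined for fewer than two values

    my $mean = 0;
    $mean += $_ / $n for @$weights;

    my $sum_sq = 0;
    $sum_sq += ($_ - $mean) ** 2 for @$weights;

    my $stdev = sqrt($sum_sq / ($n - 1));
    return $stdev / sqrt($n);
}

my @data_weight_2N = (10, 12, 14);   # hypothetical weights
my $se = std_error(\@data_weight_2N);
printf "stderr for %d fish: %.4f\n", scalar @data_weight_2N, $se;
```

The bare return for fewer than two values is a design choice in this sketch, so the caller can tell "not computable" apart from a real standard error of 0.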
> > > > print "These are the the avearge weight, stdev and stderr for $test > > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n"; > > > > my $mean_3N = &average (\@data_weight_3N); > > my $stdev_3N = &stdev (\@data_weight_3N); > > my $stderr_3N = ($stdev_3N/sqrt($i)); > > > > print "These are the the avearge weight, stdev and stderr for $test > > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n"; > > } > > > > close ($fh); Ah, rats. You checked whether open worked, you need to do the same thing on close too! close ($fh) or die $!; Or you could just use autodie qw(open close); and then they'll die appropriately when they have to and you don't have to bother with the checking. > > sub average{ > > my($data) = @_; > > if (not @$data) { > > print ("Empty array\n"); > > return 0; > > } > > my $total = 0; > > foreach (@$data) { > > $total += $_; > > } use List::AllUtils qw(sum); # somewhere up at the top of the script... my $total = sum(@$data); if (not defined $total) { print "Empty array\n"; return; } List::AllUtils is your friend. Learn to use it. Your returning 0 for an empty list is probably the wrong thing, isn't it possible for the total to actually be 0? Just return instead. Don't return undef, just return (and let perl take context into account for you). You probably don't actually want to spew "Empty array" out into your output stream, imagine writing a script that postprocesses your output and having to deal with it. If you really need to say it, send it to standard error with print STDERR "Empty array\n"; > > my $average = $total / @$data; > > return $average; If you don't really need the error message, then you can get to my $total = sum(@$data); return unless defined $total; return $total / @$data; And if an empty data array is *truly* unexpected, maybe you should just die/carp.
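A short sketch of the slimmed-down average discussed above, under two assumptions: it uses the core List::Util module (List::AllUtils re-exports the same sum), and it tests the array for emptiness rather than testing the total, so that a total of 0 is not mistaken for "no data". The sample values are hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Mean of the values in an array reference.
# Returns nothing (empty list, or undef in scalar context) for an empty
# array, so a genuine mean of 0 stays distinguishable from "no data".
sub average {
    my ($data) = @_;
    return unless @$data;        # check emptiness, not the total
    return sum(@$data) / @$data;
}

my $mean = average([2, 4, 6]);
print "mean: $mean\n";

my $zero = average([-1, 1]);     # total is 0, but the mean is still defined
print "mean of (-1, 1): $zero\n";

my $none = average([]);
print STDERR "no data\n" unless defined $none;   # diagnostics go to STDERR
```

Whether an empty input should return quietly or die is left to the caller here; for data that must never be empty, dying would be the stricter choice.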
> > } > > > > sub stdev{ > > my($data) = @_; > > if(@$data == 1){ > > return 0; > > } > > my $average = &average($data); > > my $sqtotal = 0; > > foreach(@$data) { > > $sqtotal += ($average-$_) ** 2; > > } > > my $std = ($sqtotal / (@$data-1)) ** 0.5; > > return $std; > > } Ditto on the use of List::AllUtils, etc... Phew. The only other thing I'd like to see would be an arrangement that lets you write simple tests. A simple sol'n would be to package the entire main part of the code up into e.g. a subroutine that returns a hashref keyed by family, containing a hashref keyed by 2N/3N/... and then you could just: use Test::More; use Tiago qw(summarize); my $output = summarize("test_data.tsv"); is($output->{AS5}->{'2N'}, "42", "Got the magic number"); # etc... done_testing; Thanks for sharing your code. Keep practicing! g. From carandraug+dev at gmail.com Thu Feb 14 17:13:45 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Thu, 14 Feb 2013 22:13:45 +0000 Subject: [Bioperl-l] bioperl in Google Summer of Code 2013 Message-ID: Hi we got word of it on another project I'm involved with and I was wondering. Is bioperl going to apply for the Google Summer of Code this year? http://www.google-melange.com/gsoc/homepage/google/gsoc2013 Carnë From hlapp at drycafe.net Fri Feb 15 09:28:30 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 15 Feb 2013 09:28:30 -0500 Subject: [Bioperl-l] bioperl in Google Summer of Code 2013 In-Reply-To: References: Message-ID: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> I presume the OBF does as an umbrella organization on behalf of all Bio* projects. If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors. -hilmar Sent with a tap. On Feb 14, 2013, at 5:13 PM, Carnë Draug wrote: > Hi > > we got word of it on another project I'm involved with and I was > wondering. Is bioperl going to apply for the Google Summer of Code > this year?
> > http://www.google-melange.com/gsoc/homepage/google/gsoc2013 > > Carnë > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From p.j.a.cock at googlemail.com Fri Feb 15 09:47:39 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 15 Feb 2013 14:47:39 +0000 Subject: [Bioperl-l] bioperl in Google Summer of Code 2013 In-Reply-To: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> Message-ID: On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp wrote: > I presume the OBF does as an umbrella organization on behalf of all Bio* > projects. If you fancy proposing a project idea or mentoring, now is not a > bad time to think about that or looking for co-mentors. > > -hilmar Yes, the plan is that as in the last few years, the OBF will apply to GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At this stage the Bio* projects would be wise to start coming up with some good project ideas and experienced developers thinking about being a mentor. For potential students, getting involved in the community early is a good idea (e.g. bug reports, or better fixing existing bugs) See also: http://lists.open-bio.org/mailman/listinfo/gsoc http://lists.open-bio.org/mailman/listinfo/gsoc-mentors Peter From cjfields at illinois.edu Fri Feb 15 09:59:43 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 15 Feb 2013 14:59:43 +0000 Subject: [Bioperl-l] bioperl in Google Summer of Code 2013 In-Reply-To: References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu> On Feb 15, 2013, at 8:47 AM, Peter Cock wrote: > On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp wrote: >> I presume the OBF does as an umbrella organization on behalf of all Bio* >> projects.
If you fancy proposing a project idea or mentoring, now is not a >> bad time to think about that or looking for co-mentors. >> >> -hilmar > > Yes, the plan is that as in the last few years, the OBF will apply to > GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At > this stage the Bio* projects would be wise to start coming up with > some good project ideas and experienced developers thinking about > being a mentor. For potential students, getting involved in the > community early is a good idea (e.g. bug reports, or better fixing > existing bugs) > > See also: > http://lists.open-bio.org/mailman/listinfo/gsoc > http://lists.open-bio.org/mailman/listinfo/gsoc-mentors > > Peter At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else. I can't take charge of writing up a proposal at the moment but I can certainly help edit. chris From scott at scottcain.net Fri Feb 15 14:18:37 2013 From: scott at scottcain.net (Scott Cain) Date: Fri, 15 Feb 2013 14:18:37 -0500 Subject: [Bioperl-l] sequence-region directives in gff files In-Reply-To: References: Message-ID: Hi Carnë, Thanks for pointing this out; I was only sort of paying attention to the FeatureIO discussion, and it hadn't occurred to me that my commit was the problem. I believe I've reproduced the functionality from that commit, and I even added a test that makes use of the added method (yes, I know, it surprised me too!). All of the tests now pass for me in the FeatureIO master. I'm putting it on my todo list to check that the Chado loader that makes use of Bio::FeatureIO still works as expected with the new incarnation. Thanks, Scott On Wed, Feb 13, 2013 at 5:22 AM, Carnë Draug wrote: > Hi Scott > > 3 years ago, the code for the Bio::SeqFeatureIO::* modules was split > from bioperl-live into a separate repository[1]. Because the code was
Because the code was > not removed from the bioperl-live repository, people ended up patching > on both sides, leading to 2 branches of development. Last weekend I > merged them back together with the exception of one commit that would > not longer apply[2]. > > This commit was authored by you with the following commit message: > "tiny change to Bio::FeatureIO::gff to allow the gmod chado gff3 bulk > loader to not choke when the gff file has ##sequence-region > directives. The loader is documented not to support this, but now it > will quitely ignore those directives." > > Do you think you could take a look at it? > > Thank you, > Carn? > > [1] https://github.com/bioperl/Bio-FeatureIO > [2] https://github.com/bioperl/bioperl-live/commit/7218728b66ad297953676236077fd0ec757378c0 -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From carandraug+dev at gmail.com Tue Feb 19 13:52:57 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 19 Feb 2013 18:52:57 +0000 Subject: [Bioperl-l] bioperl in Google Summer of Code 2013 In-Reply-To: References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu> Message-ID: On 15 February 2013 14:28, Hilmar Lapp wrote: > [...] > If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors. On 15 February 2013 14:59, Fields, Christopher J wrote: > At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else. I can't take charge of writing up a proposal at the moment but I can certainly help edit. I would like to participate this year as a student. I do not have however, have any bioperl itch that would last a summer to fix. The largest of them is to implement BLAST using NCBI's server. 
They have made available a SOAP-based BLAST and doing this has been on my todo for ages. Would you suggest any other project for bioperl? Carnë From peymanalavi at yahoo.com Tue Feb 19 16:16:49 2013 From: peymanalavi at yahoo.com (peyman alavi) Date: Tue, 19 Feb 2013 13:16:49 -0800 (PST) Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails Message-ID: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com> Hello, I am having problems for a while trying to install the Bio::SCF module on my Vista32. Now, I know that Bio::SCF isn't really a Bioperl module, but I need it for Bio::Graphics, and I thought perhaps other people had experienced the same problem before. I have installed zlib and io_lib (both their last available versions), but it looks like sth. (presumably with io_lib) is missing. I should be very grateful if someone could tell me what still needs to be done! Here are the paths where the io_lib "library" and "include" directories are installed, and I set them to cpan before trying to install Bio::SCF:
o conf makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"
And the following is what I get on the STDOUT:
Set up gcc environment - 4.7.2
cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.
    makepl_arg         [LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include]
Please use 'o conf commit' to make the config permanent!
Reading 'D:\Perl\cpan\Metadata'
  Database was generated on Sun, 17 Feb 2013 12:17:02 GMT
Running install for module 'Bio::SCF'
Running make for L/LD/LDS/Bio-SCF-1.03.tar.gz
Checksum for D:\Perl\cpan\sources\authors\id\L\LD\LDS\Bio-SCF-1.03.tar.gz ok
Scanning cache D:\Perl/cpan/build for sizes
............................................................................DONE
Bio-SCF-1.03/
Bio-SCF-1.03/t/
Bio-SCF-1.03/t/scf.t
Bio-SCF-1.03/eg/
Bio-SCF-1.03/eg/write_test_obj.pl
Bio-SCF-1.03/eg/write_test_tied.pl
Bio-SCF-1.03/eg/read_test_obj.pl
Bio-SCF-1.03/eg/read_test_tied.pl
Bio-SCF-1.03/SCF/
Bio-SCF-1.03/SCF/Arrays.pm
Bio-SCF-1.03/DISCLAIMER
Bio-SCF-1.03/README
Bio-SCF-1.03/SCF.pm
Bio-SCF-1.03/SCF.xs
Bio-SCF-1.03/Changes
Bio-SCF-1.03/test.scf
Bio-SCF-1.03/Makefile.PL
Bio-SCF-1.03/META.yml
Bio-SCF-1.03/INSTALL
Bio-SCF-1.03/MANIFEST
  CPAN.pm: Building L/LD/LDS/Bio-SCF-1.03.tar.gz
Set up gcc environment - 4.7.2
Checking if your kit is complete...
Looks good
Writing Makefile for Bio::SCF
Writing MYMETA.yml and MYMETA.json
cp SCF.pm blib\lib\Bio\SCF.pm
cp SCF/Arrays.pm blib\lib\Bio\SCF\Arrays.pm
D:\Perl\bin\perl.exe D:\Perl\site\lib\ExtUtils\xsubpp -typemap D:\Perl\lib\ExtUtils\typemap
SCF.xs > SCF.xsc && D:\Perl\bin\perl.exe -MExtUtils::Command -e mv -- SCF.xsc SCF.c
Please specify prototyping behavior for SCF.xs (see perlxs manual)
c:/MinGW/bin/gcc.exe -c -Ic:/MinGW/msys/1.0/local/include -DNDEBUG -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T -DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields -O2 -DVERSION=\"1.03\" -DXS_VERSION=\"1.03\" "-ID:\Perl\lib\CORE" -DLITTLE_ENDIAN SCF.c
In file included from c:/MinGW/msys/1.0/local/include/io_lib/scf.h:31:0,
                 from SCF.xs:12:
c:/MinGW/msys/1.0/local/include/io_lib/mFILE.h:23:0: warning: "MF_APPEND" redefined [enabled by default]
In file included from c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/windows.h:55:0,
                 from D:\Perl\lib\CORE/win32.h:61,
                 from D:\Perl\lib\CORE/win32thread.h:4,
                 from D:\Perl\lib\CORE/perl.h:2825,
                 from SCF.xs:5:
c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/winuser.h:131:0: note: this is the location of the previous definition
SCF.xs: In function 'XS_Bio__SCF_get_scf_pointer':
SCF.xs:35:2: warning: passing argument 3 of '(*Perl_ILIO_ptr((struct PerlInterpreter *)Perl_get_context()))->pNameStat' from incompatible pointer type [enabled by default]
SCF.xs:35:2: note: expected 'struct _stati64 *' but argument is of type 'struct stat *'
Running Mkbootstrap for Bio::SCF ()
D:\Perl\bin\perl.exe -MExtUtils::Command -e chmod -- 644 SCF.bs
D:\Perl\bin\perl.exe -MExtUtils::Mksymlists \
     -e "Mksymlists('NAME'=>\"Bio::SCF\", 'DLBASE' => 'SCF', 'DL_FUNCS' => {  }, 'FUNCLIST' => [], 'IMPORTS' => {  }, 'DL_VARS' => []);"
Set up gcc environment - 4.7.2
dlltool --def SCF.def --output-exp dll.exp
c:\MinGW\bin\g++.exe -o blib\arch\auto\Bio\SCF\SCF.dll -Wl,--base-file -Wl,dll.base -mdll -L"D:\Perl\lib\CORE" SCF.o
D:\Perl\lib\CORE\perl512.lib c:\MinGW\lib\libkernel32.a c:\MinGW\lib\libuser32.a c:\MinGW\lib\libgdi32.a c:\MinGW\lib\libwinspool.a c:\MinGW\lib\libcomdlg32.a c:\MinGW\lib\libadvapi32.a c:\MinGW\lib\libshell32.a c:\MinGW\lib\libole32.a c:\MinGW\lib\liboleaut32.a c:\MinGW\lib\libnetapi32.a c:\MinGW\lib\libuuid.a c:\MinGW\lib\libws2_32.a c:\MinGW\lib\libmpr.a c:\MinGW\lib\libwinmm.a c:\MinGW\lib\libversion.a c:\MinGW\lib\libodbc32.a c:\MinGW\lib\libodbccp32.a c:\MinGW\lib\libcomctl32.a c:\MinGW\lib\libmsvcrt.a dll.exp
Warning: resolving _VirtualQuery@12 by linking to _VirtualQuery
Use --enable-stdcall-fixup to disable these warnings
Use --disable-stdcall-fixup to disable these fixups
Warning: resolving _VirtualProtect@16 by linking to _VirtualProtect
Warning: resolving _EnterCriticalSection@4 by linking to _EnterCriticalSection
Warning: resolving _TlsGetValue@4 by linking to _TlsGetValue
Warning: resolving _GetLastError@0 by linking to _GetLastError
Warning: resolving _LeaveCriticalSection@4 by linking to _LeaveCriticalSection
Warning: resolving _DeleteCriticalSection@4 by linking to _DeleteCriticalSection
Warning: resolving _InitializeCriticalSection@4 by linking to _InitializeCriticalSection
SCF.o:SCF.c:(.text+0xf35): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0xf4b): undefined reference to `mfwrite_scf'
SCF.o:SCF.c:(.text+0xf6a): undefined reference to `mfflush'
SCF.o:SCF.c:(.text+0xf72): undefined reference to `mfdestroy'
SCF.o:SCF.c:(.text+0x1138): undefined reference to `write_scf'
SCF.o:SCF.c:(.text+0x16ac): undefined reference to `scf_deallocate'
SCF.o:SCF.c:(.text+0x17b1): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0x17c1): undefined reference to `mfread_scf'
SCF.o:SCF.c:(.text+0x19bd): undefined reference to `read_scf'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: SCF.o: bad reloc address 0xa4 in section `.rdata'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: final
link failed: Invalid operation
collect2.exe: error: ld returned 1 exit status
dmake.exe:  Error code 129, while making 'blib\arch\auto\Bio\SCF\SCF.dll'
  LDS/Bio-SCF-1.03.tar.gz
  D:\Perl\site\bin\dmake.exe -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 LDS/Bio-SCF-1.03.tar.gz                      : make NO
Warning: Configuration not saved.
Lockfile removed.
Thanks in advance for any useful suggestions/help!! Peyman From scott at scottcain.net Tue Feb 19 18:39:44 2013 From: scott at scottcain.net (Scott Cain) Date: Tue, 19 Feb 2013 18:39:44 -0500 Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails In-Reply-To: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com> References: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com> Message-ID: <777246AB-2EF0-403D-9652-8EA8390D5C53@scottcain.net> Hi Peyman, I have no idea what might be required to get staden and Bio::SCF installed on a windows machine; you have my sympathies for having to go through it. But what I wanted to touch on was what you wrote, that is, that you "need" it for Bio::Graphics. I just wanted to point out that you don't need it unless you want to be able to display traces from ABI sequencers (which most people don't really care to do these days). Bio::SCF is listed as a recommended module, not a required one. Scott Sent from my iPad On Feb 19, 2013, at 4:16 PM, peyman alavi wrote: > Hello, > I am having > problems for a while trying to install the Bio::SCF module on my Vista32. Now, I know that Bio::SCF isn't really a Bioperl module, but I need it for Bio::Graphics, and I thought perhaps other people had experienced the same problem before. I > have installed zlib and io_lib (both their last available versions), but it > looks like sth.
(presumably with io_lib) is missing. I should be very grateful > if someone could tell me what still needs to be done! > [...] > Thanks in advance for any useful > suggestions/help!! > Peyman > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From anngregory at email.arizona.edu Wed Feb 20 00:20:41 2013 From: anngregory at email.arizona.edu (Ann Gregory) Date: Tue, 19 Feb 2013 22:20:41 -0700 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file Message-ID: Hi BioPerl, I am having issues with a BioPerl script. I have a blastxml file from a blastx blast and the original multifasta file containing the original nucleotides sequences.
I want to take the blast result (i.e. the blast description) and annotate my multifasta file. I have written 2 while loops that extract the blast descriptions as well as the nucleotide sequence from the multifasta file. My problem is that I cannot incorporate one of the while loops into the other without losing the loop property of one of the loops. I would like to take the 1st blast description, then the 1st nucleotide sequence, then the 2nd blast description, then the 2nd nucleotide sequence and so on... I just can't figure out how to alternate the results. See script below:

use warnings;
use strict;
use Bio::SearchIO;
use Bio::SeqIO;

my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            my $qd = $hit->description;
            print $qd, "\n";
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

-- Ann (Nina) Gregory Graduate Student Rich Lab / Sullivan Lab Soil, Water, Environmental Science Department University of Arizona From yonexhalaolv at gmail.com Wed Feb 20 04:17:12 2013 From: yonexhalaolv at gmail.com (Sebastian Lau) Date: Wed, 20 Feb 2013 01:17:12 -0800 (PST) Subject: [Bioperl-l] failed to install via fink: no package found for specification 'bioperl-pm5100'! Message-ID: <84fa1bcb-a39f-4847-bff2-e3a9c2b909ea@googlegroups.com> Hi guys, I am just about to install bioperl on my MacOS 10.7.5 via fink, but after typing the command, fink said it couldn't find any package:
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm5100
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm5100'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm588
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm588'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm586
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm586'!
I followed the instructions on the wiki. I don't know what's wrong with it. Thanks for your help. From awitney at sgul.ac.uk Wed Feb 20 10:22:51 2013 From: awitney at sgul.ac.uk (Adam Witney) Date: Wed, 20 Feb 2013 15:22:51 +0000 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file In-Reply-To: References: Message-ID: <5124EA4B.5020409@sgul.ac.uk> Hi Ann, On 20/02/2013 05:20, Ann Gregory wrote: > [...] I think what you are proposing assumes that the loop over the BLAST results will come back in the same order as the loop over the FASTA file. This may be the case, but I'm not sure it's something I would rely on. Anyway, I would loop over the BLAST results, storing the relevant data in an array or hash, and then loop over the FASTA file to put the two together, e.g.:

my $blast_data;
while ( ... blast data ... ) {
    ...
    $blast_data->{$qd} = ...
}
while ( my $seqobj = $seqio->next_seq ) {
    my $id = $seqobj->id;
    print $blast_data->{$id}."\n";
}

Something along those lines... or have I misunderstood you? If so, can you provide some more details, like what you want your output to look like? HTH Adam From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 11:24:50 2013 From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach) Date: Wed, 20 Feb 2013 17:24:50 +0100 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file In-Reply-To: References: Message-ID: <5124F8D2.4020904@uni-wuerzburg.de> Oops, I just realized I had one loop too much in there. Adam is correct. Sorry. The last part of the code I sent you should look like this:

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    print ">$hits{$seqobj->display_id}\n";
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

Cheers, Andreas -- Andreas Leimbach Universität Münster Institut für Hygiene Mendelstr.
7 D-48149 Münster Germany Tel.: +49 (0)551 39 3843 E-Mail: andreas.leimbach at uni-wuerzburg.de On 20.2.13 06:20, Ann Gregory wrote: > [...] From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 11:14:29 2013 From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach) Date: Wed, 20 Feb 2013 17:14:29 +0100 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: References: Message-ID: <5124F665.5050602@uni-wuerzburg.de> Hi Ann, I agree with Adam, but I was already writing my email, while his came in. Hope it helps: I hope I understand correctly what you want to do. Just to clarify, you queried a protein blast database with blastx and nucleotide queries. Now you want to associate the protein description for the FIRST blast hit with the corresponding nucleotide fasta file. Is that correct? You have to put the two while loops into one another. Or associate the blast hits with the query descriptions. But it's not feasible to take the first blast hit and the first nucleotide fasta seq, then the 2nd of both etc, as Adam already pointed out. You would have to iterate through both at the same time. I.e. take the first blast hit, then iterate through the nucleotide fasta until you find the hit. Then take the 2nd blast hit and iterate through the nucleotide fasta etc. It's probably easiest to do this in a hash. Something along the lines of (not tested I just punched that in the E-Mail): my %hits; my $hit_desc; my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]"); while (my $result = $search_in->next_result) { while (my $hit = $result->next_hit) { while (my $hsp = $hit->next_hsp) { if ($hit->description eq $hit_desc) { # Only want the first blast hit next; } my $hit_desc = $hit->description; $hits{$result->query_description} = $hit_desc; } } } my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); foreach my $query (keys %hits) { while (my $seqobj = $seqio->next_seq) { if ($seqobj->display_id eq $query) { print ">$hits{$query}\n"; my $nuc = $seqobj->seq(); print $nuc, "\n"; } You might want to put some evalue cutoff in there to only score significant hits. Also if your nucleotide query multi-fasta file is very large, you might consider creating an index first: http://www.bioperl.org/wiki/HOWTO:Local_Databases#Bio::Index Hope that helps! 
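
The hash lookup and e-value cutoff suggested in this message can be sketched roughly as follows. This is only an illustration, not the code from the thread: the helper names best_hits and rows_from_blastxml and the cutoff 1e-5 are made up for the example, and it assumes that the query name reported by Bio::SearchIO matches the display_id of the FASTA record, which depends on how the query deflines were written. The BioPerl modules are loaded lazily, so the selection logic can be tried without any input files.

```perl
use strict;
use warnings;

# For each query ID, keep the description of the hit with the lowest
# e-value, ignoring hits above the cutoff.
# Input rows are [query_id, hit_description, evalue].
sub best_hits {
    my ($cutoff, @rows) = @_;
    my (%desc, %best);
    for my $row (@rows) {
        my ($query, $description, $evalue) = @$row;
        next if $evalue > $cutoff;                                  # not significant
        next if exists $best{$query} && $best{$query} <= $evalue;   # already have a better hit
        $best{$query} = $evalue;
        $desc{$query} = $description;
    }
    return \%desc;
}

# Collect the rows from a BLAST XML report with BioPerl.
sub rows_from_blastxml {
    my ($file) = @_;
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new(-format => 'blastxml', -file => $file);
    my @rows;
    while (my $result = $in->next_result) {
        while (my $hit = $result->next_hit) {
            push @rows, [$result->query_name, $hit->description, $hit->significance];
        }
    }
    return @rows;
}

# Usage: perl annotate.pl report.xml queries.fasta
if (@ARGV == 2) {
    require Bio::SeqIO;
    my $desc  = best_hits(1e-5, rows_from_blastxml($ARGV[0]));
    my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => $ARGV[1]);
    while (my $seq = $seqio->next_seq) {
        my $id = $seq->display_id;
        next unless exists $desc->{$id};    # skip queries without a significant hit
        print ">$id $desc->{$id}\n", $seq->seq, "\n";
    }
}
```

Because the FASTA file is read once in its own order and each record is looked up by ID, the result does not depend on the BLAST report and the FASTA file being sorted the same way.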
Cheers, Andreas P.S.: Please next time include version numbers for BioPerl and Perl and a little more detail what you want to do. ;-) -- Andreas Leimbach Universität Münster Institut für Hygiene Mendelstr. 7 D-48149 Münster Germany Tel.: +49 (0)551 39 3843 E-Mail: andreas.leimbach at uni-wuerzburg.de On 20.2.13 06:20, Ann Gregory wrote: > [...] From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 12:00:51 2013 From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach) Date: Wed, 20 Feb 2013 18:00:51 +0100 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file In-Reply-To: References: <5124F8D2.4020904@uni-wuerzburg.de> Message-ID: <51250143.9050503@uni-wuerzburg.de> Hey Ann, damn, it's not my best day ... Anyways, I wouldn't work with List::MoreUtils's each_array function, as this assumes that the blast hits and the nucleotide queries are in the same order (as Adam pointed out). Rather use a hash which associates a key to a certain value. Also, the hash can be used to skip sequences that have no hits.
Here's my new version: my %hits; my $hit_desc; my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]"); while (my $result = $search_in->next_result) { while (my $hit = $result->next_hit) { while (my $hsp = $hit->next_hsp) { $hits{$result->query_description} = $hit->description; # hash: associate query_desc (key) with hit_desc (value) last; # jump out of the while loop; this should resolve getting only the first hit } last; # see above } } my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); while (my $seqobj = $seqio->next_seq) { if ($hits{$seqobj->display_id}) { # only true if display_id associated with hit_desc and should skip seqs without hits print ">$hits{$seqobj->display_id}\n"; my $nuc = $seqobj->seq(); print $nuc, "\n"; } } Cheers, Andreas P.S.: I redirected your mail to the BioPerl mailing list, others might profit from my mistakes ;-) ... -- Andreas Leimbach Universit?t M?nster Institut f?r Hygiene Mendelstr. 7 D-48149 M?nster Germany Tel.: +49 (0)551 39 3843 E-Mail: andreas.leimbach at uni-wuerzburg.de On 20.2.13 17:35, Ann Gregory wrote: > Hi Andreas, > > Thanks for you help! I don't understand how this gets the first blast hit: > > if ($hit->description eq $hit_desc) { # Only want the first blast hit > next; > } > > I tried this and seems to be working...but I can't get the 1st blast hit > or skip the sequences that had no hits. Do you know any quick fixes? 
>
> *
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> use List::MoreUtils qw(each_array);
>
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
> my @ids;
> while (my $result = $search_in->next_result) {
>     while (my $hit = $result->next_hit) {
>         while (my $hsp = $hit->next_hsp) {
>             my $match = $result->num_hits;
>             push(@ids, $qd);
>         }
>     }
> }
> }
>
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> my @seqs;
> while (my $seqobj = $seqio->next_seq) {
>     my $nuc = $seqobj->seq();
>     push(@seqs, $nuc);
> }
>
> my $it = each_array(@ids, @seqs);
> while (my ($ids, $seqs) = $it->()) {
>     print $ids, "\n", $seqs, "\n";
> }
> *
>
> Thanks again!
> ~Ann
>
> On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach wrote:
>
>     oops, I just realized I had one loop too much in there. Adam is
>     correct. Sorry.
>
>     The last part of the code I sent you should look like this:
>
>     my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>     while (my $seqobj = $seqio->next_seq) {
>         print ">$hits{$seqobj->display_id}\n";
>         my $nuc = $seqobj->seq();
>         print $nuc, "\n";
>     }
>
>     Cheers,
>     Andreas
>
>     --
>     Andreas Leimbach
>     Universität Münster
>     Institut für Hygiene
>     Mendelstr. 7
>     D-48149 Münster
>     Germany
>
>     Tel.: +49 (0)551 39 3843
>     E-Mail: andreas.leimbach at uni-wuerzburg.de
>
>     On 20.2.13 06:20, Ann Gregory wrote:
>
>         Hi BioPerl,
>
>         I am having issues with a BioPerl script. I have a blastxml file from a
>         blastx blast and the original multifasta file containing the original
>         nucleotide sequences.
>
>         I want to take the blast result (i.e. the blast description) and annotate my
>         multifasta file.
>
>         I have written 2 while loops that extract the blast descriptions as well as
>         the nucleotide sequence from the multifasta file.
>
>         My problem is that I cannot incorporate one of the while loops into the
>         other without losing the loop property of one of the loops. I would like
>         to take the 1st blast description, then the 1st nucleotide sequence, then
>         the 2nd blast description, then the 2nd nucleotide sequence and so
>         on...just can't figure out how to alternate the results.
>
>         See script below:
>
>         use warnings;
>         use strict;
>         use Bio::SearchIO;
>         use Bio::SeqIO;
>
>         my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
>         while (my $result = $search_in->next_result) {
>             while (my $hit = $result->next_hit) {
>                 while (my $hsp = $hit->next_hsp) {
>                     my $qd = $hit->description;
>                     print $qd, "\n";
>                 }
>             }
>         }
>
>         my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
>         while (my $seqobj = $seqio->next_seq) {
>             my $nuc = $seqobj->seq();
>             print $nuc, "\n";
>         }
>
>         --
>         Ann (Nina) Gregory
>         Graduate Student
>         Rich Lab / Sullivan Lab
>         Soil, Water, Environmental Science Department
>         University of Arizona
>
> --
> Ann (Nina) Gregory
> Graduate Student
> Rich Lab / Sullivan Lab
> Soil, Water, Environmental Science Department
> University of Arizona

From cjfields at illinois.edu  Wed Feb 20 13:24:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 20 Feb 2013 18:24:58 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <51250143.9050503@uni-wuerzburg.de>
References: <5124F8D2.4020904@uni-wuerzburg.de> <51250143.9050503@uni-wuerzburg.de>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2EB4A@CHIMBX5.ad.uillinois.edu>

If this is meant to be something done using the same FASTA files for a bunch of BLAST reports, it might be worth setting up a flat file index and using that to look up and grab the sequences; it should be a LOT faster, and only the first pass (generation of the initial index) would take a little time. Look at Bio::DB::Fasta for an example.
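
The indexed lookup suggested here could be sketched roughly as follows. This is only an illustration, not code from the thread: it assumes BioPerl is installed and that the FASTA IDs are the same strings used as keys for the hits. The helper name annotated_records, the file name and the hash contents are all made up for the example.

```perl
use strict;
use warnings;
use Bio::DB::Fasta;

# Return FASTA-formatted records for every ID in %$hits that is found in the
# indexed file. $hits is a hypothetical hash ref: sequence ID => hit description.
sub annotated_records {
    my ($fasta_file, $hits) = @_;

    # The first call writes an index next to the FASTA file; later calls reuse it.
    my $db = Bio::DB::Fasta->new($fasta_file);

    my @records;
    for my $id (sort keys %$hits) {
        my $nuc = $db->seq($id);      # undef when the ID is not in the index
        next unless defined $nuc;     # skip hits without a matching sequence
        push @records, ">$hits->{$id}\n$nuc\n";
    }
    return @records;
}

# Usage (file name and hash contents are placeholders):
# print annotated_records('queries.fasta', { contig_1 => 'hypothetical protein' });
```

Compared with re-reading the whole FASTA file through Bio::SeqIO for every report, this looks up only the sequences that actually have a hit, which is where the speed-up would come from once the index exists.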
chris

From carandraug+dev at gmail.com  Mon Feb 25 05:08:23 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Mon, 25 Feb 2013 10:08:23 +0000
Subject: [Bioperl-l] module for description of sequence variants (where to place code)
Message-ID:

Hi

I'm writing a perl module to write a description of the variance between 2 sequences as described on http://www.hgvs.org/mutnomen/recs-prot.html

Basically, given 2 sequences, it would return something like "p.Lys2del p.His25_Met26insGln" if those are the differences. It also accounts for the existence of - characters in the sequences that may come from their alignment.

My question is, where in the project tree should I place the module?

Also, is there something already written that would convert from 1 to 3 letter code?

Carnë

From andreas.leimbach at uni-wuerzburg.de  Mon Feb 25 05:32:43 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Mon, 25 Feb 2013 11:32:43 +0100
Subject: [Bioperl-l] module for description of sequence variants (where to place code)
In-Reply-To:
References:
Message-ID: <512B3DCB.7050008@uni-wuerzburg.de>

Hi Carnë,

for your last question: You can convert aa strings from one to three letter code with 'Bio::SeqUtils'.

Cheers,
Andreas

-- 
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

From genehack at genehack.org  Wed Feb 27 19:57:48 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Wed, 27 Feb 2013 16:57:48 -0800
Subject: [Bioperl-l] YAPC talks?
Message-ID:

Hi -

Is there anyone that was planning on submitting a Bioperl talk to YAPC::NA? In an unrelated conversation, one of the organizers expressed an interest in getting a Bioperl talk this year.

If no one else is planning on a talk submission, Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl rules / Bioperl sucks" overview/state of the dist style talk...

thanks,
john.

From cjfields at illinois.edu  Wed Feb 27 21:48:55 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 28 Feb 2013 02:48:55 +0000
Subject: [Bioperl-l] YAPC talks?
In-Reply-To:
References:
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6E705CD3@CHIMBX5.ad.uillinois.edu>

At the moment I personally have no plans on going, but I think a no-holds-barred bioperl talk is a good idea.

chris

From hlapp at drycafe.net  Wed Feb 27 22:20:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 27 Feb 2013 22:20:34 -0500
Subject: [Bioperl-l] YAPC talks?
In-Reply-To:
References:
Message-ID: <42C1F1B8-FE26-43A8-B601-E80D17D215EC@drycafe.net>

On Feb 27, 2013, at 7:57 PM, John SJ Anderson wrote:

> Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl
> rules / Bioperl sucks" overview/state of the dist style talk...

Please videotape.
I'll be sure to watch and promote it :-)

	-hilmar

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================

From saladi1 at illinois.edu  Thu Feb 28 01:58:20 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Wed, 27 Feb 2013 22:58:20 -0800
Subject: [Bioperl-l] EUtilities Cookbook - Accn to gi
Message-ID:

Hi,

I think that rettype for the section "Get GIs for a list of accessions" should be

    -rettype => 'gi');

instead of 'gilist' as it is now. I think this change is due to a change in NCBI eutils.

webpage: http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Get_GIs_for_a_list_of_accessions

Thanks,
Shyam

From fossandonc at hotmail.com  Thu Feb 28 10:36:34 2013
From: fossandonc at hotmail.com (Francisco J. Ossandón)
Date: Thu, 28 Feb 2013 12:36:34 -0300
Subject: [Bioperl-l] Fix for Bug #3376 broke somewhere else
Message-ID:

Hi,

I was re-checking Bug #3302 using the Bio::SearchIO modules of the repository and found that now it can't parse a Hmmer2 file that was previously fine. After tracking the problem, I discovered that a change in a regular expression to fix another bug broke the parse.

The fix for the Bug #3376 consisted in adding an extra condition to omit lines where end of domain indicator is split across lines (https://redmine.open-bio.org/issues/3376):

    TEST: domain 1 of 1, from 8 to 97: score 184.7, E = 2.5e-56
    *->svfqqqqssksttgstvtAiAiAigYRYRYRAvtWnsGsLssGvnDn
    sv+qqqq+ + +vtAiAiAigYRYRYRAv Wn GsLs G nDn
    Test 8 SVYQQQQGGSA----MVTAIAIAIGYRYRYRAVVWNKGSLSTGTNDN 50
    DnDqqsdgLYtiYYsvtvpssslpsqtviHHHaHkasstkiiikiePr<-
    DnDq +d LYtiYYsvtv +ss+p q+v+HHHaH+asstkiiiki P
    Test 51 DNDQAAD-LYTIYYSVTVSASSWPGQSVTHHHAHPASSTKIIIKIAPS 97
    *
    Test - -

This case is characterized by the 2 dashes in the line...

So the expression added in hmmer2.pm - "next_result"
(https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2):

    elsif (CORE::length($_) == 0
        || ( $count != 1 && /^\s+$/o )
        || /^\s+\-?\*\s*$/
        || /^.+\-\s+\-\s*$/ )    ### <--- This regex was designed for bug 3376
    {
        next;
    }

But the expression used is too broad because it uses the "^.+" just before the 2 dashes, and it broke the parsing of lines like these, which are full of dashes:

    KyACrqCdtiVQAPaPakpIErGiptaGLLArvlVSKyaEHlPLYRQsEI
    lcl|gi|340 - -------------------------------------------------- -
    yaRqGVeiaRstLadWVgrtgarLaPLvdALaeyVLkeGklHADeTPVqV
    +i s L V++ + r
    lcl|gi|340 60938 ------AIMISGLIHGVSARCLRF-------------------------- 60955

I think a reasonable fix that still fixes the original bug and restores the function for this case is to add an extra \s+ in the regex just before the first dash, so the expression makes sure that the first dash is the one that comes AFTER the description (and is replacing the usual coordinate number) and is not the last of an alignment or a series of dashes like the one above:

    elsif (CORE::length($_) == 0
        || ( $count != 1 && /^\s+$/o )
        || /^\s+\-?\*\s*$/
        || /^.+\s+\-\s+\-\s*$/ )    ### <--- Tweaked regex
    {
        next;
    }

I tested it and it works fine, hope you find the fix acceptable.

Cheers,

-- 
Francisco J. Ossandon
Bioinformatician.
Ph.D. Candidate, University Andres Bello.
Center for Bioinformatics and Genome Biology,
Fundacion Ciencia para la Vida.
Santiago, Chile.
www.cienciavida.cl/CBGB.htm

From PDagosto at edgebio.com  Mon Feb 25 11:50:34 2013
From: PDagosto at edgebio.com (Phil Dagosto)
Date: Mon, 25 Feb 2013 16:50:34 +0000
Subject: [Bioperl-l] Error when running Build.PL
Message-ID:

Greetings,

I downloaded BioPerl 1.6.1 from this location:

http://www.bioperl.org/wiki/Getting_BioPerl

When I ran Build.PL with all of the default settings chosen in the interactive mode I got the following error message:

Could not get valid metadata. Error is: Invalid metadata structure.
Errors: 'Perl_5' for 'license' does not have a URL scheme (resources -> license) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::gff -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::WebAgent -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::EUtilParameters -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::OntologyIO::InterProParser -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Biblio::IO::medlinexml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::strider -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::RandomFactory -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA::ESEfinder -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::game::gameSubs -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::interpro -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::berkeleydb -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::entrezgene -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tinyseq -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::chadoxml -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::SeqIO::game::gameWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::FileCache -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::bsml_sax -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Primer3 -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::HtSNP -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tree::Compatible -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Taxonomy::entrez -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::agave -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::TagHaplotype -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::SeqFeature::Store::FeatureFileLoader -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::Protein* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::blastxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::EUtilities -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::Tree::Draw::Cladogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tigrxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Collection -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Draw::Pictogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::Writer::BSMLResultWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::HIVQuery -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::TreeIO::svggraph -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::eutils -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern::BackTranslate -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::GenBank -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Variation::IO::xml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::GraphViz -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Annotated -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::DB::NCBIHelper -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::HIV -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Run::RemoteBlast -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::excel -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::ClusterIO::dbsnp -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Microarray::Tools::ReseqChip -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::soap -> requires) [Validation: 1.2] at /usr/local/lib/perl5/5.10.1/Module/Build/Base.pm line 4559
Could not create MYMETA files
Creating new 'Build' script for 'BioPerl' version '1.006001'

I have no idea whether this is a problem or not or if I can proceed. Also, I'm confused by the version number referenced in the last line. 1.006001 is our current version - I thought I was installing version 1.6.1. Are these version numbers equivalent, i.e., are the zeros not meaningful? I was actually looking for version 1.2.3 (or greater) - where can I find that?

Thanks,
Phil

Phil Dagosto
Sr.
Software Engineer
Edge Bio
201 Perry Parkway, Suite 5
Gaithersburg, MD 20850
pdagosto at edgebio.com
(240) 912-8669

From chapmanb at 50mail.com  Thu Feb 28 21:30:01 2013
From: chapmanb at 50mail.com (Brad Chapman)
Date: Thu, 28 Feb 2013 21:30:01 -0500
Subject: [Bioperl-l] Coming soon: BOSC/Broad Hackathon, BOSC Codefest
Message-ID: <874ngvua1i.fsf@fastmail.fm>

Hi all;
There are some upcoming coding events and conferences of interest to open source biology programmers:

- BOSC/Broad Interoperability Hackathon -- This is a two day coding session at the Broad Institute in Cambridge, MA on April 7-8 focused on improving tool interoperability.
  Sign up and details: http://j.mp/XJT6ew

- Codefest at the Bioinformatics Open Source Conference -- This year BOSC is taking place in Berlin from July 19-20 and we'll have a two day coding session before the conference. This is the 4th year of Codefests and they've proven to be a productive and fun time to work collectively on open source projects.
  Sign up and details: http://www.open-bio.org/wiki/Codefest_2013
  BOSC conference: http://www.open-bio.org/wiki/BOSC_2013

Here are the key dates for the events and abstracts:

April 7-8, 2013: BOSC/Broad Interoperability Hackathon, Cambridge, MA
April 12, 2013: BOSC abstracts due
July 17-18, 2013: Codefest 2013, Berlin
July 19-20, 2013: BOSC 2013, Berlin

Looking forward to seeing everyone this spring and summer for plenty of fun science and code,
Brad

From jason.stajich at gmail.com  Fri Feb 1 01:58:57 2013
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 31 Jan 2013 22:58:57 -0800
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
References: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
Message-ID:

Dan -

I think the answer is yes if others are doing it - I am not in a position to be much of a main coder.
I don't know which format you speak of here or if you had to write something for the text blast changes or something else. Specific bug reports on formats that aren't working is always helpful. The XML format has been pretty stable so I would suggest that if you are simply parsing reports not looking at them. Chris posted instructions on how to contribute and the move to github simplifies this. That you had to write a whole new parser seems probably a bit severe - I hope that in the future people can speak to the problems sooner. If I hit a wall with something I can't do I usually write the code to fix it and contribute it back but I don't play follow-the-format-changes with the tools anymore, but hopefully others like yourself can make the contributions. If you speak to the response I made to the question below, I don't think anyone will be trying and support the NCBI's additional markups that refer to the upstream and downstream features as they are laid out in the text files without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable. best wishes, Jason On Jan 30, 2013, at 1:40 PM, Dan kilburn wrote: > Hi Jason, > > Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help. 
> > --Dan > Sent from my iPhone > > On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote: > >> Send Bioperl-l mailing list submissions to >> bioperl-l at lists.open-bio.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> or, via email, send a message with subject or body 'help' to >> bioperl-l-request at lists.open-bio.org >> >> You can reach the person managing the list at >> bioperl-l-owner at lists.open-bio.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of Bioperl-l digest..." >> >> >> Today's Topics: >> >> 1. Re: Parsing Blast-Report extracting "Features flanking .." >> (Jason Stajich) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Tue, 29 Jan 2013 11:00:16 -0800 >> From: Jason Stajich >> Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features >> flanking .." >> To: buschj at hhu.de >> Cc: bioperl-l at lists.open-bio.org >> Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com> >> Content-Type: text/plain; charset=us-ascii >> >> We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that. >> >> However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features. 
>> >> basically: >> - download the genome and GFF for arabidopsis >> - align your sRNA to the genome with a short read aligner - bowtie, bwa, others >> - convert your sam to bam file with SAMtools or picard >> - compare the location of features with the reads to get expression summaries or individuals reads with BEDTools >> >> >> On Jan 25, 2013, at 2:20 AM, jobu wrote: >> >>> Am 22.01.2013 19:03, schrieb Mgavi Brathwaite: >>>> What upstream and downstream elements are you interested in? >>> >>> >>> I've got a huge pile of short RNA reads. >>> Part of the question now is whether those RNA fragments originate from >>> siRNA events, >>> or may represent miRNAs / parts of pre-miRNAs. >>> >>> So I did an online blast search against database nt. >>> The resulting report quite often just gives subject information like this: >>> >>> ----- >>>> gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence >>> Length=23459830 >>> ----- >>> >>> Now I would like to get the hit's neighbouring regions for further >>> analysis. >>> Preferably I would like to do that in an automized way, but the only >>> possible action with this kind of subject gi | description would be to >>> fetch the entire chromosomal sequence I guess ? >>> >>> However, >>> right below the line above, the report states more precisely: >>> >>> ------ >>> Features flanking this part of subject sequence: >>> 8872 bp at 5' side: cytochrome P450 90B1 >>> 402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K >>> ------ >>> >>> Still I would like to have the possibility to automatically fetch the >>> subject's sequence(s), >>> as of now I think parsing the report with SearchIO won't let me aquire >>> that information, because SearchIO does not recognize report sections >>> like those. >>> >>> I hope I did not miss any of SearchIOs capabilities, but I could not >>> find any method covering my wish?! 
>>>
>>> Right now maybe the only way to get the information I want is to
>>> construct my own parser and write it out into a separate file, which in
>>> turn I could read into a hash before processing the Blast-Report
>>> with SearchIO to combine both data for further automated work.
>>>
>>> I am aware though that even successfully getting the flanking features
>>> would leave me with the more or less wide intergenic gap my hsp is
>>> located in.
>>>
>>> However I'm in need of a way to get the flanking features including
>>> their annotation and the region spanning between them.
>>> But I hope I do not have to get complete sequences to accomplish that,
>>> as this would be kind of an overkill.
>>>
>>> with kind regards
>>> Jochen
>>
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>> ------------------------------
>>
>> End of Bioperl-l Digest, Vol 117, Issue 13
>> ******************************************

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org

From dr_kilburn59 at yahoo.com  Fri Feb 1 09:25:34 2013
From: dr_kilburn59 at yahoo.com (Dan Kilburn)
Date: Fri, 1 Feb 2013 06:25:34 -0800 (PST)
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To:
References: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
Message-ID: <1359728734.27412.YahooMailNeo@web162006.mail.bf1.yahoo.com>

Hi Jason,

Thanks for the detailed feedback.
The real reason I had to write my own parser is that even with close, repeated support from NCBI we couldn't get XML output with short_web_blast.pl because the parameter that turns on XML output was not functioning (they've probably fixed it by now), and I had to crank out a parser asap to support a job talk.

I don't think the upstream and downstream feature reports are particularly useful, because in mammals they tend to be so far away that they are not likely to be biologically relevant. But the internal motif reports are useful, maybe especially if you are blasting short reads, like I was. A 16-mer preserved domain hit is really good if you're blasting 18-mer Illumina short reads, like I was.

As far as my involvement goes, I got diagnosed with cancer on Wednesday, so I'll be taking a step back until next week's surgery and taking a lot of deep breaths. On the other hand, this just makes me more motivated: I've been thinking a lot about time, and timely contributions, the last two days.

Cheers,
Dan

________________________________
From: Jason Stajich
To: Dan kilburn
Cc: "bioperl-l at lists.open-bio.org"
Sent: Friday, February 1, 2013 1:58 AM
Subject: Re: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13

Dan -

I think the answer is yes if others are doing it - I am not in a position to be much of a main coder. I don't know which format you speak of here or if you had to write something for the text blast changes or something else. Specific bug reports on formats that aren't working is always helpful. The XML format has been pretty stable so I would suggest that if you are simply parsing reports not looking at them. Chris posted instructions on how to contribute and the move to github simplifies this. That you had to write a whole new parser seems probably a bit severe - I hope that in the future people can speak to the problems sooner.
If I hit a wall with something I can't do I usually write the code to fix it and contribute it back but I don't play follow-the-format-changes with the tools anymore, but hopefully others like yourself can make the contributions. If you speak to the response I made to the question below, I don't think anyone will be trying and support the NCBI's additional markups that refer to the upstream and downstream features as they are laid out in the text files without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable. best wishes, Jason On Jan 30, 2013, at 1:40 PM, Dan kilburn wrote: Hi Jason, > >Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help. > >--Dan >Sent from my iPhone > >On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote: > > >Send Bioperl-l mailing list submissions to >>??bioperl-l at lists.open-bio.org >> >>To subscribe or unsubscribe via the World Wide Web, visit >>??http://lists.open-bio.org/mailman/listinfo/bioperl-l >>or, via email, send a message with subject or body 'help' to >>??bioperl-l-request at lists.open-bio.org >> >>You can reach the person managing the list at >>??bioperl-l-owner at lists.open-bio.org >> >>When replying, please edit your Subject line so it is more specific >>than "Re: Contents of Bioperl-l digest..." >> >> >>Today's Topics: >> >>?1. Re: ?Parsing Blast-Report extracting "Features flanking ???.." >>????(Jason Stajich) >> >> >>---------------------------------------------------------------------- >> >>Message: 1 >>Date: Tue, 29 Jan 2013 11:00:16 -0800 >>From: Jason Stajich >>Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features >>??flanking ???.." 
>>To: buschj at hhu.de >>Cc: bioperl-l at lists.open-bio.org >>Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com> >>Content-Type: text/plain; ???charset=us-ascii >> >>We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that. >> >>However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features. >> >>basically: >>- download the genome and GFF for arabidopsis >>- align your sRNA to the genome with a short read aligner - bowtie, bwa, others >>- convert your sam to bam file with SAMtools or picard >>- compare the location of features with the reads to get expression summaries or individuals reads with BEDTools >> >> >>On Jan 25, 2013, at 2:20 AM, jobu wrote: >> >> >>Am 22.01.2013 19:03, schrieb Mgavi Brathwaite: >>> >>>What upstream and downstream elements are you interested in? >>>> >>> >>>I've got a huge pile of short RNA reads. >>>Part of the question now is whether those RNA fragments originate from >>>siRNA events, >>>or may represent miRNAs / parts of pre-miRNAs. >>> >>>So I did an online ?blast search against database nt. >>>The resulting report quite often just gives subject information like this: >>> >>>----- >>> >>>gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence >>>>Length=23459830 >>>----- >>> >>>Now I would like to get the hit's neighbouring regions ?for further >>>analysis. >>>Preferably I would like to do that ?in an automized way, but the only >>>possible action with this kind of subject gi | description would be to >>>fetch the entire chromosomal ?sequence I guess ? 
>>> >>>However, >>>right below the line above, the report states more precisely: >>> >>>------ >>>Features flanking this part of subject sequence: >>>8872 bp at 5' side: cytochrome P450 90B1 >>>402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K >>>------ >>> >>>Still I would like to have the possibility to automatically fetch the >>>subject's sequence(s), >>>as of now I think ?parsing the report with SearchIO won't let me aquire >>>that information, because SearchIO does not recognize report sections >>>like those. >>> >>>I hope I did not miss any of SearchIOs capabilities, but I could not >>>find any method covering my wish?! >>> >>>Right now maybe the only way to get the information I want is to >>>construct my own parser and write it out into a separate file, which in >>>turn again ?I could read into a hash before processing the Blast-Report >>>with SearchIO to combine both data for further automized work. >>> >>>I am aware though that even successfully getting the flanking features >>>would leave me with the more or less wide ?intergenic gap my hsp is >>>located in. >>> >>>However I'm in need of a way to get the flanking features including >>>their annotation and the region spanning between them. >>>But I hope I do not have to get complete sequences to accomplish that, >>>as this would be kind of an overkill. 
>>> >>>with kind regards >>>Jochen >>> >>> >>> >>>_______________________________________________ >>>Bioperl-l mailing list >>>Bioperl-l at lists.open-bio.org >>>http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>Jason Stajich >>jason.stajich at gmail.com >>jason at bioperl.org >> >> >> >> >>------------------------------ >> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l at lists.open-bio.org >>http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >>End of Bioperl-l Digest, Vol 117, Issue 13 >>****************************************** >> >_______________________________________________ >Bioperl-l mailing list >Bioperl-l at lists.open-bio.org >http://lists.open-bio.org/mailman/listinfo/bioperl-l > Jason Stajich jason.stajich at gmail.com jason at bioperl.org From carandraug+dev at gmail.com Sat Feb 2 20:44:31 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Sun, 3 Feb 2013 01:44:31 +0000 Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option Message-ID: Hi the TCoffee module does not options of the named argument type: -arg => option one needs to do like 'arg' => option Is there a special reason for this? I tracked down this to the commit 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e 12 years ago[1]. A comment on the code actually says "don't want named parameters"[2] (though the commit message sounds pretty innocuous "migrated to new Bio::Root::RootI chained new"). Is there a reason for this? The rest of bioperl has no issue with named parameters, and the API should be the same as Clustalw which also has no problem with it. This is very easy to fix, I can submit a pull request no problem. Also, shouldn't the code complain in the case of non-supported options? Took me a very long time to find out the problem because there was no complaints coming from the code. There is also a problem with the way it handles the output option. 
I'll have to look closer into it, but the documentation is simply incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta' (undocumented), works fine. Carn? [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374 From cjfields at illinois.edu Sun Feb 3 16:54:51 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 3 Feb 2013 21:54:51 +0000 Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu> Carn?, On Feb 2, 2013, at 7:44 PM, Carn? Draug wrote: > Hi > > the TCoffee module does not options of the named argument type: > > -arg => option > > one needs to do like > > 'arg' => option > > Is there a special reason for this? I tracked down this to the commit > > 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e > > 12 years ago[1]. A comment on the code actually says "don't want named > parameters"[2] (though the commit message sounds pretty innocuous > "migrated to new Bio::Root::RootI chained new"). Is there a reason for > this? The rest of bioperl has no issue with named parameters, and the > API should be the same as Clustalw which also has no problem with it. > This is very easy to fix, I can submit a pull request no problem. IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones. This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency. The downside of big changes like this: potential backwards compatibility issues. 
Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change. I don't have a problem breaking this with a bioperl 2.0 release, though. > Also, shouldn't the code complain in the case of non-supported > options? Took me a very long time to find out the problem because > there was no complaints coming from the code. Yes, it should complain when options are given that do not make sense, some validation would help there. With some modules this might be a side-effect of using AUTOLOAD or simply not checking the parameters. > There is also a problem with the way it handles the output option. > I'll have to look closer into it, but the documentation is simply > incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta' > (undocumented), works fine. That's entirely possible. > Carn? > [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e > [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374 As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it. Infernal was this way IIRC. Maybe these should just be simply stored as a semi-validated set of key-value pairs. 
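
The two ideas discussed above (accepting both the '-arg' and 'arg' spellings, and complaining about unsupported options) can be sketched in plain Perl. This is only an illustration under stated assumptions, not the actual TCoffee wrapper code: the sub name normalize_params and the option names in %ALLOWED are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical allowed-option table (invented for this sketch); a real
# wrapper would list the options its wrapped program actually supports.
my %ALLOWED = map { $_ => 1 } qw(output matrix quiet);

# Accept both '-arg' and 'arg' spellings, and die on unknown options
# instead of silently ignoring them.
sub normalize_params {
    my @args = @_;
    die "Odd number of parameters\n" if @args % 2;
    my %params;
    while (my ($key, $value) = splice(@args, 0, 2)) {
        (my $name = lc $key) =~ s/^-//;    # strip one leading dash
        die "Unsupported option '$key'\n" unless $ALLOWED{$name};
        $params{$name} = $value;
    }
    return \%params;
}

# Both spellings end up under the same key.
my $p = normalize_params(-output => 'fasta', 'quiet' => 1);
print join(', ', map { "$_=$p->{$_}" } sort keys %$p), "\n";

# A misspelled option is reported rather than dropped.
eval { normalize_params(-outptu => 'fasta') };
print "Error: $@" if $@;
```

Storing the pairs in a plain hash like this sidesteps the getter/setter naming problem mentioned above, since a hash key may contain hyphens or start with a digit.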
chris From carandraug+dev at gmail.com Sun Feb 3 23:34:22 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Mon, 4 Feb 2013 04:34:22 +0000 Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu> Message-ID: On 3 February 2013 21:54, Fields, Christopher J wrote: > On Feb 2, 2013, at 7:44 PM, Carn? Draug wrote: > >> Hi >> >> the TCoffee module does not options of the named argument type: >> >> -arg => option >> >> one needs to do like >> >> 'arg' => option >> >> Is there a special reason for this? I tracked down this to the commit >> >> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e >> >> 12 years ago[1]. A comment on the code actually says "don't want named >> parameters"[2] (though the commit message sounds pretty innocuous >> "migrated to new Bio::Root::RootI chained new"). Is there a reason for >> this? The rest of bioperl has no issue with named parameters, and the >> API should be the same as Clustalw which also has no problem with it. >> This is very easy to fix, I can submit a pull request no problem. > > IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones. This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency. > > The downside of big changes like this: potential backwards compatibility issues. Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change. I don't have a problem breaking this with a bioperl 2.0 release, though. Should passing the tests be enough? There's one for TCofee. 
At the moment I don't see how this would cause compatibility issues: we are adding an option, not removing one. But the comment in the code, stating plainly that the -param API was not wanted, caught me by surprise, which is why I'm asking.

> As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it. Infernal was this way IIRC. Maybe these should just be simply stored as a semi-validated set of key-value pairs.

From a quick glance at the list of TCoffee parameters, I don't at the moment see any that should cause problems. I have submitted a bug report[1] which mentions some other issues I found with TCoffee. If someone could comment on them, that would be great, and I can start fixing them.

Carnë

[1] https://redmine.open-bio.org/issues/3406

From whereverroadgoes at gmail.com Mon Feb 4 10:39:19 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 07:39:19 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
Message-ID: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>

The result I get is:

Number of bases of type A =
Number of bases of type C =
Number of bases of type G =
Number of bases of type T =

i.e. the expected values are missing. Please help!

#! /usr/bin/perl
use Bio::Tools::SeqStats;
use Bio::Seq;
open (FILE, "seq.fasta");
@array = <FILE>;
# Removing first line of fasta
shift (@array);
$array = join('', @array);
open (FILE2, ">>seq2.fasta");
print FILE2 "$array";
$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
- alphabet => 'dna',);
my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
my $monomer_ref = $seq_stats->count_monomers();
foreach $base (sort keys %$monomer_ref) {
print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";
}

From hamish.mcwilliam at bioinfo-user.org.uk Mon Feb 4 11:59:16 2013
From: hamish.mcwilliam at bioinfo-user.org.uk (Hamish McWilliam)
Date: Mon, 4 Feb 2013 16:59:16 +0000
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
In-Reply-To:
References: <200305311150.h4VBopn2019091@localhost.localdomain>
Message-ID:

BLASTCLUST is part of the legacy NCBI BLAST package (not NCBI BLAST+) and can be obtained from: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST

As Robert notes there are many other tools which can be used to perform sequence clustering. Wikipedia has a Sequence Clustering article (http://en.wikipedia.org/wiki/Sequence_clustering) which lists some of the most commonly used.

All the best,

Hamish

On 1 February 2013 04:15, Rob wrote:
> Cyril C.C. Chua bmb.leeds.ac.uk> writes:
>
>>
>> Hi,
>>
>> I have some difficulty in sourcing for BLASTCLUST or related
>> programs/mods. Does any1 know exactly how to locate them?
>> >> Regards >> >> Cyril Chua >> > > > Hi Cyril, > > I heard of the following programmes that might do similar things (I HAVEN'T > used any of them yet): > > Afree - http://www.vicbioinformatics.com/software.afree.shtml > Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf > Usearch - http://www.drive5.com/usearch/ > DomClust - http://mbgd.genome.ad.jp/domclust/ > > or > > Check this: > > http://ppod.princeton.edu/help/help_tech.html > > God bless, > > > Robert > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ---- "Saying the internet has changed dramatically over the last five years is clich? ? the internet is always changing dramatically" - Craig Labovitz, Arbor Networks. From whereverroadgoes at gmail.com Mon Feb 4 12:34:10 2013 From: whereverroadgoes at gmail.com (Slym) Date: Mon, 4 Feb 2013 09:34:10 -0800 (PST) Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: Thanks Roy, It still doesn't seem to produce anything. :/ From roy.chaudhuri at gmail.com Mon Feb 4 12:51:03 2013 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Mon, 4 Feb 2013 17:51:03 +0000 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: Sorry, I'd missed another problem in your code - you are trying to load a fasta file using Bio::PrimarySeq. To read sequence data from a file you should use Bio::SeqIO, see: http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_file http://www.bioperl.org/wiki/HOWTO:SeqIO Cheers, Roy. 
From asjo at koldfront.dk Mon Feb 4 12:58:25 2013
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Mon, 04 Feb 2013 18:58:25 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> (Slym's message of "Mon, 4 Feb 2013 07:39:19 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID: <8738xc2c72.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 07:39:19 -0800 (PST), Slym wrote:

> #! /usr/bin/perl
> use Bio::Tools::SeqStats;
> use Bio::Seq;

It can be a good idea to add "use strict; use warnings;" to the top of your script. At least two problems in your program would have been caught by perl if you had.

> open (FILE, "seq.fasta");

Using (global) literal filehandles and the two parameter open() is somewhat outdated, a more current way to do it could be:

  open my $fh, '<', 'seq.fasta';

> @array = <FILE>;
> # Removing first line of fasta
> shift (@array);
> $array = join('', @array);
> open (FILE2, ">>seq2.fasta");
> print FILE2 "$array";

Note that you are writing just the sequence to your seq2.fasta file here, so the new file isn't really a fasta file.

> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
> - alphabet => 'dna',);

Bio::PrimarySeq doesn't take a '-file' parameter. Also, note that the filename is different than before: "sekw2" vs. "seq2"! Either you should use Bio::SeqIO with a '-file' parameter, or you can use Bio::PrimarySeq with a '-seq' parameter.

> my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
> my $monomer_ref = $seq_stats->count_monomers();
> foreach $base (sort keys %$monomer_ref) {
> print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";

Here you wanted $monomer_ref->{$base}, as %monomer_ref isn't mentioned anywhere else.

> }

Here is a complete version of your script - I chose to use Bio::SeqIO - that works:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use Bio::SeqIO;
  use Bio::Tools::SeqStats;
  my $io=Bio::SeqIO->new(-file=>'seq.fasta', -alphabet=>'dna');
  my $seqobj=$io->next_seq; # Get the first sequence from the file
  my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
  my $monomer_ref = $seq_stats->count_monomers();
  foreach my $base (sort keys %$monomer_ref) {
      print "Liczba zasad typu ", $base," = ", $monomer_ref->{$base},"\n";
  }

E.g.:

  $ cat seq.fasta
  >test
  aaaacccggt
  $ ./slym.pl
  Liczba zasad typu A = 4
  Liczba zasad typu C = 3
  Liczba zasad typu G = 2
  Liczba zasad typu T = 1
  $

Best regards,

Adam

--
"Grittings. Ma nam is Kahlfin."
Adam Sjøgren
asjo at koldfront.dk

From whereverroadgoes at gmail.com Mon Feb 4 13:02:29 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 10:02:29 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To:
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID:

The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an error (saying that it wants input provided by Bio::PrimarySeq).
(btw in this line
$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 'dna',);
there's a typo "sekw2" instead of "seq2" but this is correct in my original code).
From cjfields at illinois.edu Mon Feb 4 13:54:39 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 4 Feb 2013 18:54:39 +0000 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE161ED@CHIMBX5.ad.uillinois.edu> Please make sure and read both Roy's and Adam's responses all the way through; Bio::SeqIO is not a sequence object but the front-end for format parsing (e.g. FASTA, etc). Bio::PrimarySeq does not have a '-file' parameter, Bio::SeqIO does. If SeqStats truly doesn't work with Bio::Seq we can fix that, but according to Adam he has tested using Bio::SeqIO out and it seems to work. chris On Feb 4, 2013, at 12:02 PM, Slym wrote: > The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an > error (saying that it wants input provided by Bio::PrimarySeq). > (btw in this line > $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => > 'dna',); > there's a typo "sekw2" instead of "seq2" but this is correct in my original > code). > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From asjo at koldfront.dk Mon Feb 4 15:00:32 2013 From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=) Date: Mon, 04 Feb 2013 21:00:32 +0100 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: (Slym's message of "Mon, 4 Feb 2013 10:02:29 -0800 (PST)") References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: <87txpr26jj.fsf@topper.koldfront.dk> On Mon, 4 Feb 2013 10:02:29 -0800 (PST), Slym wrote: > The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an > error (saying that it wants input provided by Bio::PrimarySeq). 
That sounds like you forgot to call ->next_seq() on the Bio::SeqIO object - to get a sequence object - please see the complete, working example I sent earlier. Best regards, Adam -- "Denial springs eternal." Adam Sj?gren asjo at koldfront.dk From scott at scottcain.net Tue Feb 5 09:45:14 2013 From: scott at scottcain.net (Scott Cain) Date: Tue, 5 Feb 2013 09:45:14 -0500 Subject: [Bioperl-l] Have your say in the 2013 GMOD Community Survey! Message-ID: Give us your thoughts on the GMOD project and win a personal DNA test from 23andMe! The GMOD project provides tools like GBrowse, Galaxy, MAKER, JBrowse, Tripal, Apollo, Chado, and many more to a huge community of users and developers around the world. To make sure that GMOD is giving you the support you need, we want to know how you use GMOD, which components you find valuable, your opinion on support, training, and GMOD's strengths and weaknesses. Your feedback is vital in helping GMOD to serve its user community more effectively and to suggest future directions for the project. Do the survey: http://gmod.org/survey.html The survey should take between 10 and 15 minutes (including thinking time), and participants can enter a draw to win "A Journey Through Your DNA", the personal DNA test from 23andMe (the winner can pick a $50 Amazon gift voucher if they prefer). The survey will be open until March 1st. Results will be collated and discussed at the April 2013 GMOD Meeting in Cambridge, UK, and posted on the GMOD wiki at http://gmod.org. Please spread the word to other friends and colleagues who use GMOD: the more voices we hear, the better the picture we get of the needs of our users, and the better we can help you! Do the survey: http://gmod.org/survey.html If you have any questions or problems with the survey, please email me -- I will be happy to help out! Thanks, Scott -- ------------------------------------------------------------------------ Scott Cain, Ph. D. 
scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research

From tiago.hori at gmail.com Tue Feb 5 10:21:55 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:21:55 -0800 (PST)
Subject: [Bioperl-l] Search I::O
Message-ID: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>

Hi All,

I am trying to find the best putative orthologs for 44K Atlantic Salmon sequences, and so I need to parse 44K BLAST reports to find the best human hit. I am trying to learn Search::IO, but when I try the first example on the HOWTO:

use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast' -file => 'C001R047.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=", $result->query_name,
            " Hit=", $hit->name,
            " Length=", $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }
  }
}

I get this error:

Odd number of elements in hash assignment at /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

I am using BioPerl version 1.6.901. Is there a format problem with the blast reports? Any help would be greatly appreciated!

T.
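
One likely cause, offered as a reading of the script as posted rather than a confirmed diagnosis: there is no comma between 'blast' and -file in the constructor call. Perl then treats 'blast' -file as a subtraction, so the constructor receives three arguments instead of four, which would explain the "Odd number of elements in hash assignment" message. The plain-Perl sketch below needs no BioPerl; count_args is a made-up helper that only counts its arguments.

```perl
use strict;
use warnings;

# Made-up helper for this sketch: reports how many arguments it received.
sub count_args { return scalar @_ }

{
    # Without the comma, 'blast' -file is parsed as the subtraction
    # 'blast' - 'file'. Both strings numify to 0, so a single value
    # is left where a value and a key were intended.
    no warnings 'numeric';
    print count_args(-format => 'blast' -file => 'C001R047.txt'), "\n";   # 3
}

# With the comma, the list holds two complete key/value pairs.
print count_args(-format => 'blast', -file => 'C001R047.txt'), "\n";      # 4
```

If that is what is happening here, adding the comma (-format => 'blast', -file => 'C001R047.txt') should be enough, and the BLAST reports themselves are probably fine.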
From tiago.hori at gmail.com Tue Feb 5 10:33:32 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:33:32 -0800 (PST)
Subject: [Bioperl-l] Search::IO example from HOWTO
Message-ID:

Hi All,

I am trying to run the example from the Search::IO HOWTO:

use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast' -file => 'test.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=", $result->query_name,
            " Hit=", $hit->name,
            " Length=", $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }
  }
}

And I get this error:

Odd number of elements in hash assignment at /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

Can anybody help?

Cheers,
T.

From carandraug+dev at gmail.com Tue Feb 5 13:56:21 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 18:56:21 +0000
Subject: [Bioperl-l] removing packages from bioperl-live
Message-ID:

Hi

some of the bioperl-live packages have already been split into separate repositories. However, they were never actually removed from bioperl-live. This creates 2 entry points for bug fixes and implementations. After a chat on #bioperl, I was told to ask here.

Should these be removed? For example, there's bioperl-FeatureIO but that code also exists in bioperl-live. Can I remove it from bioperl-live?

Carnë

From cjfields at illinois.edu Tue Feb 5 14:34:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 19:34:07 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO?
was Re: removing packages from bioperl-live In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Probably should retitle this to ask the question directly (make sure the right radars are pinged). My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). chris On Feb 5, 2013, at 12:56 PM, Carn? Draug wrote: > Hi > > some of the bioperl-live packages have already been split into > separate repositories. However, they were never actually removed from > bioperl-live. This creates 2 entry points for bug fixes and > implementations. After a chat on #bioperl, I was told to ask here. > > Should these be removed? For example, there's bioperl-FeatureIO but > that code alo exists in bioperl-live. Can I remove it from > bioperl-live? > > Carn? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Tue Feb 5 14:36:10 2013 From: scott at scottcain.net (Scott Cain) Date: Tue, 5 Feb 2013 14:36:10 -0500 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: I'm sure it will lead to lots of fun, but I suspect you are right and it should be removed. It's time you yank on that bandaid :-) Scott On Tue, Feb 5, 2013 at 2:34 PM, Fields, Christopher J wrote: > Probably should retitle this to ask the question directly (make sure the right radars are pinged). > > My vote is yes, it should be removed. 
There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). > > chris > > On Feb 5, 2013, at 12:56 PM, Carn? Draug wrote: > >> Hi >> >> some of the bioperl-live packages have already been split into >> separate repositories. However, they were never actually removed from >> bioperl-live. This creates 2 entry points for bug fixes and >> implementations. After a chat on #bioperl, I was told to ask here. >> >> Should these be removed? For example, there's bioperl-FeatureIO but >> that code alo exists in bioperl-live. Can I remove it from >> bioperl-live? >> >> Carn? >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From carandraug+dev at gmail.com Tue Feb 5 15:06:23 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 5 Feb 2013 20:06:23 +0000 Subject: [Bioperl-l] dependencies on perl version Message-ID: Hi how much perl backwards compatibility does bioperl needs to keep? If I have something I want to implement and use state (requires 5.010), is it acceptable? 5.010 is already a quite old perl version. Of course, there are other less elegant ways to implement those features. If I can't use modern perl stuff, what version number is the limit? Carn? 
From carandraug+dev at gmail.com Tue Feb 5 15:10:01 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 20:10:01 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID:

On 5 February 2013 19:34, Fields, Christopher J wrote:
> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>
> My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).

Mentioning Bio::FeatureIO was just an example. I meant to ask it as
more general. If the code is already in a separate repository, should
it be removed from bioperl-live?

Carnë

From cjfields at illinois.edu Tue Feb 5 15:56:48 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:56:48 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To:
References:
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>

Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top.

(for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)

chris

On Feb 5, 2013, at 2:06 PM, Carnë Draug wrote:

> Hi
>
> how much perl backwards compatibility does bioperl need to keep?
>
> If I have something I want to implement and use state (requires
> 5.010), is it acceptable? 5.010 is already a quite old perl version.
> Of course, there are other less elegant ways to implement those
> features. If I can't use modern perl stuff, what version number is the
> limit?
>
> Carnë

From cjfields at illinois.edu Tue Feb 5 15:59:38 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 20:59:38 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live
In-Reply-To:
References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>

On Feb 5, 2013, at 2:10 PM, Carnë Draug wrote:

> On 5 February 2013 19:34, Fields, Christopher J wrote:
>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>>
>> My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>
> Mentioning Bio::FeatureIO was just an example. I meant to ask it as
> more general. If the code is already in a separate repository, should
> it be removed from bioperl-live?
>
> Carnë

Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better). Once we get a new release out we should remove the rest.
chris

From cjfields at illinois.edu Tue Feb 5 16:53:29 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 21:53:29 +0000
Subject: [Bioperl-l] Next BioPerl release
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu>

All,

I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated!

Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository:

https://github.com/bioperl/Bio-FeatureIO

Feedback, suggestions, etc are greatly appreciated.

chris

From miker at htblis.com Tue Feb 5 19:54:17 2013
From: miker at htblis.com (Michael Rogoff)
Date: Tue, 5 Feb 2013 16:54:17 -0800
Subject: [Bioperl-l] Bio::Graphics error when rendering features with Split locations
Message-ID:

When trying to render features from a genbank file that include a split location e.g.:

     promoter        join(1000..1080,1..5)
                     /label=PROM1

The following exception is raised:

Can't locate object method "has_tag" via package "Bio::Location::Simple" at lib/perl5/site_perl/5.10.1/Bio/Graphics/Glyph.pm line 704, line 36.

This can be reproduced with the code in the example "Rendering Features from a GenBank or EMBL File" from the Graphics HOW-TO:
http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File

Is there a way to change the script so that split locations would, at the very least, not cause a fatal error? Is there a different glyph type that needs to be used?

Thanks in advance for any help.
I've attached a simple genbank input that will reproduce the error:

LOCUS       sample2                 1080 bp    DNA     circular
DEFINITION  Cloning vector sample2
ACCESSION   sample2
VERSION     sample2.1  GI:4352432
COMMENT     Component Fragments
FEATURES             Location/Qualifiers
     terminator      39..328
                     /label=TERM1
                     /note="terminator 1"
     misc_feature    393..488
                     /label=MF1
     CDS             complement(800..900)
                     /label=CDS1
                     /note="resistence gene"
     promoter        join(1000..1080,1..5)
                     /label=PROM1
ORIGIN
        1 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
       61 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      121 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      181 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      241 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      301 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      361 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      421 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      481 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      541 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      601 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      661 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      721 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      781 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      841 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      901 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
      961 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
     1021 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
//

P.S. I think I have traced the source of the problem to Glyph's _subfeat method, which in the case of a feature with split locations is returning location objects instead of feature objects. Is this a bug?
sub _subfeat {
  my $class   = shift;
  my $feature = shift;

  return $feature->segments if $feature->can('segments');

  my @split = eval {
    my $id   = $feature->location->seq_id;
    my @subs = $feature->location->sub_Location;
    grep {$id eq $_->seq_id} @subs;
  };

  return @split if @split;

  # Either the APIs have changed, or I got confused at some point...
  return $feature->get_SeqFeatures if $feature->can('get_SeqFeatures');
  return $feature->sub_SeqFeature  if $feature->can('sub_SeqFeature');
  return;
}

From l.m.timmermans at students.uu.nl Tue Feb 5 21:40:27 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 6 Feb 2013 03:40:27 +0100
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID:

On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J wrote:
> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top.
>
> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible)

I *really* hate saying it, but I fear a lot of places are still stuck
on 5.8, in particular on 5.8.8 because of CentOS 5. I know my
department still is and doesn't seem to be in a hurry to upgrade, and
I'm pretty sure it won't be the only one (though personally I use a
self-compiled 5.16).

Leon

From florent.angly at gmail.com Tue Feb 5 21:51:27 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 06 Feb 2013 12:51:27 +1000
Subject: [Bioperl-l] Removing Bio::FeatureIO?
was Re: removing packages from bioperl-live
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu>
Message-ID: <5111C52F.50101@gmail.com>

On 06/02/13 06:59, Fields, Christopher J wrote:
> On Feb 5, 2013, at 2:10 PM, Carnë Draug wrote:
>
>> On 5 February 2013 19:34, Fields, Christopher J wrote:
>>> Probably should retitle this to ask the question directly (make sure the right radars are pinged).
>>>
>>> My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on).
>> Mentioning Bio::FeatureIO was just an example. I meant to ask it as
>> more general. If the code is already in a separate repository, should
>> it be removed from bioperl-live?
>>
>> Carnë
> Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better). Once we get a new release out we should remove the rest.
>
> chris

Sounds good to me (I've been burnt once by the fact that Bio::FeatureIO is in two places).
Florent From florent.angly at gmail.com Tue Feb 5 21:56:19 2013 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 06 Feb 2013 12:56:19 +1000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: <5111C653.2010703@gmail.com> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). Florent On 06/02/13 12:40, Leon Timmermans wrote: > On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J > wrote: >> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >> >> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) > I *really* hate saying it, but I fear a lot of places are still stuck > on 5.8, in particular on 5.8.8 because of CentOS 5. I know my > department still is and doesn't seem to be in a hurry to upgrade, and > I'm pretty sure it won't be the only one (though personally I use a > self-compiled 5.16). > > Leon > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at drycafe.net Tue Feb 5 22:27:35 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 5 Feb 2013 22:27:35 -0500 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <09524241-59F8-4BFF-8054-53CD0A649C11@drycafe.net> On Feb 5, 2013, at 4:53 PM, Fields, Christopher J wrote: > I am scheduling the next BioPerl CPAN release tentatively for March 1. Yay!! Thanks for your leadership again, Chris, and for volunteering your time for the project. 
If nothing else, and I know this is no compensation really worth speaking of, we owe you beer, and I'll certainly pay my debt to you in Berlin if you come there. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From hlapp at drycafe.net Tue Feb 5 22:32:40 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 5 Feb 2013 22:32:40 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <5111C653.2010703@gmail.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS. 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. -hilmar On Feb 5, 2013, at 9:56 PM, Florent Angly wrote: > For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). > Florent > > On 06/02/13 12:40, Leon Timmermans wrote: >> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J >> wrote: >>> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >>> >>> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) >> I *really* hate saying it, but I fear a lot of places are still stuck >> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my >> department still is and doesn't seem to be in a hurry to upgrade, and >> I'm pretty sure it won't be the only one (though personally I use a >> self-compiled 5.16). 
>> >> Leon >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Tue Feb 5 22:58:08 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 03:58:08 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18CBE@CHIMBX5.ad.uillinois.edu> Re: being held back, I agree. I don't necessarily want to intentionally break current modules by adding modern code unless it can be demonstrated to be a decent benefit performance-wise, but I don't want to impede new additions by requiring compat with perl 5.8 (hence my suggestion of a 'use 5.01x' pragma when appropriate). Ubuntu 12.04 LTS is on perl 5.14.2: http://askubuntu.com/questions/80672/what-perl-version-will-be-in-12-04-lts BTW, I was wrong about perl 5.8 being 8 yrs old; it's almost 11 yrs old (perl 5.8.0 was released on 7/18/2002). perl 5.8 reached end-of-life in 2008, fixes being only for security reasons. So, I support dropping perl 5.8 support, but we should have a decent route of use for the folks stuck on old clusters. chris On Feb 5, 2013, at 9:32 PM, Hilmar Lapp wrote: > Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS. > > 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. 
> > -hilmar > > On Feb 5, 2013, at 9:56 PM, Florent Angly wrote: > >> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). >> Florent >> >> On 06/02/13 12:40, Leon Timmermans wrote: >>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J >>> wrote: >>>> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >>>> >>>> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) >>> I *really* hate saying it, but I fear a lot of places are still stuck >>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my >>> department still is and doesn't seem to be in a hurry to upgrade, and >>> I'm pretty sure it won't be the only one (though personally I use a >>> self-compiled 5.16). >>> >>> Leon >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From l.m.timmermans at students.uu.nl Tue Feb 5 23:11:52 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 05:11:52 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: On Wed, Feb 6, 2013 at 4:32 
AM, Hilmar Lapp wrote: > Does anyone know what Ubuntu uses? 5.14.2, distrowatch is your friend ;-) > I've heard lots of other old version problems with CentOS. I know people who still use CentOS 4 in production :-| > 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. CentOS 5 is 6 years old (and will be supported another 4), but CentOS 6 is 'only' 19 months. perl missing a release in the 5.8-5.10 timeframe combined with an unfortunate alignment of its release schedule with Red Hat's don't do us any favors here. Leon From cjfields at illinois.edu Tue Feb 5 23:14:24 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 04:14:24 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18E52@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 8:40 PM, Leon Timmermans wrote: > On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J > wrote: >> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >> >> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) > > I *really* hate saying it, but I fear a lot of places are still stuck > on 5.8, in particular on 5.8.8 because of CentOS 5. I know my > department still is and doesn't seem to be in a hurry to upgrade, and > I'm pretty sure it won't be the only one (though personally I use a > self-compiled 5.16). > > Leon We had the same problem for a while, but our sysadmins were willing to set up perl 5.12 (at that time) loadable as a module (we can of course set up a local perl as well). We're now using a sysadmin-installed perl 5.16 with our current cluster. 
chris From cjfields at illinois.edu Tue Feb 5 23:24:31 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 04:24:31 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 10:11 PM, Leon Timmermans wrote: > On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp wrote: >> Does anyone know what Ubuntu uses? > > 5.14.2, distrowatch is your friend ;-) > >> I've heard lots of other old version problems with CentOS. > > I know people who still use CentOS 4 in production :-| > >> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. > > CentOS 5 is 6 years old (and will be supported another 4), but CentOS > 6 is 'only' 19 months. perl missing a release in the 5.8-5.10 > timeframe combined with an unfortunate alignment of its release > schedule with Red Hat's don't do us any favors here. > > Leon Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7). We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases. 
chris From l.m.timmermans at students.uu.nl Tue Feb 5 23:33:57 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 05:33:57 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 6, 2013 at 5:24 AM, Fields, Christopher J wrote: > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7). > > We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases. Sounds reasonable. These things shouldn't come as a surprise. I suspect that the thing that will save us is that most of these people install it once and then never upgrade. Leon From hartzell at alerce.com Wed Feb 6 12:58:07 2013 From: hartzell at alerce.com (George Hartzell) Date: Wed, 6 Feb 2013 09:58:07 -0800 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> Message-ID: <20754.39343.128576.743448@gargle.gargle.HOWL> Fields, Christopher J writes: > [...] > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point > out that Python users are in the same boat: the Python version for > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > (and recommends python 2.7). 
> > We can always state that perl 5.8 is supported for the upcoming > Bioperl release, but we're dropping v5.8 support for any future > releases. Do more than drop support for 5.8. The Perl community has put a transparent and predictable process in place for releasing [generally] better versions of the language. It means that Perl has a chance of continuing to be relevant, attracting new talent and actually *fixing* some of the s&%t that gives Perl a bad rap. It gives people something to plan around, no one should be surprised that v 5.X.Y is coming out in mid 20ZZ. BioPerl should do the same thing, declare a release policy that trails along with the Perl release schedule. Keep it simple and no one can argue with it. Support Perl releases as long as the releases themselves are supported. Rather than expending energy supporting out of date platforms, put the energy into being modern (or Modern...), better distro building and packaging, testing, documentation and releasing so that the process of staying current is painless. Look forward. Keep it interesting and fun. Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone make their living running sequencing gels in Plexiglas doohickeys on their lab bench? I'm not suggesting that the BioPerl community is free to make arbitrary and capricious changes that makes it difficult for *anyone* to get anything done. Churn is a waste of time. But why should the all-volunteer BioPerl community be stuck supporting code from 12 years ago because it's cost effective for someone else to avoid spending *their* $/time/people to stay up to date. Those sites that value stability/maturity/stagnation so highly have already accepted the cost/difficulty of nailing one of their feet to the floor as they try to run forward. They recognize and depend on the benefits of having that stable base but generally they've also accepted the costs associated with their restrictive choices. 
They know how to pull in separate kernel/driver updates so that they can actually run on nearly modern hardware. They know, and live with, the fact that they're not going to have access to the shiny new stuff. And they know how to stay up to date, when they need to, with the software that their users need to be competitive (e.g. BioConductor and R). As long as (if/when...) updating a BioPerl release is something that can reliably happen with a few cpanm invocations then the sites that otherwise favor punctuated equilibrium will learn to handle gradual change. Those folks that are "stuck" on older releases always have the option of supporting professional Perl programmers to keep older releases going, backport changes, etc.... They're already buying support for their platforms (or freeloading and coping), let them put bread on the table at one of the bioinformatics consultancies or labs if they have something special they need. Have fun. Use sharp tools. Do cool science. Build cool things. No one is paying you to be backwards compatible with the previous millennium. g. From amackey at virginia.edu Wed Feb 6 13:47:46 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Wed, 6 Feb 2013 13:47:46 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: Huzzah! -- Aaron J. Mackey, PhD Assistant Professor Center for Public Health Genomics University of Virginia amackey at virginia.edu http://www.cphg.virginia.edu/mackey On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: > > [...] > > Right, it took ~8 yrs to go from 5.8 to 5.10. 
I'd like to point > > out that Python users are in the same boat: the Python version for > > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > > (and recommends python 2.7). > > > > We can always state that perl 5.8 is supported for the upcoming > > Bioperl release, but we're dropping v5.8 support for any future > > releases. > > Do more than drop support for 5.8. > > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. > > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. 
> > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. 

From tiago.hori at gmail.com Wed Feb 6 08:25:41 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Wed, 6 Feb 2013 05:25:41 -0800 (PST)
Subject: [Bioperl-l] Problems installing Bio::Tools::Run:StandAloneBlastPlus
Message-ID: <9b488c6e-34b3-4269-a7ac-e2206720939a@googlegroups.com>

Hi Guys,

I am trying to install the module Bio::Tools::Run::StandAloneBlastPlus, but it has been hard so far. I managed to install and compile samtools, after finding all the dependencies, but I am still missing something! I posted the complete report below!

Any help would be great!

Cheers,

T.

cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus
Reading '/home/tiagohori/.cpan/Metadata'
  Database was generated on Tue, 05 Feb 2013 18:41:03 GMT
Running install for module 'Bio::Tools::Run::StandAloneBlastPlus'
Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz
Checksum for /home/tiagohori/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz ok
Scanning cache /home/tiagohori/.cpan/build for sizes
..................................------------------------------------------DONE
DEL(1/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz
DEL(2/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz.yml
DEL(3/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO
DEL(4/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO.yml
DEL(5/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC
DEL(6/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC.yml
DEL(7/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt
DEL(8/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt.yml
DEL(9/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4
DEL(10/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4.yml
DEL(11/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5
DEL(12/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5.yml DEL(13/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn DEL(14/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn.yml DEL(15/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o DEL(16/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o.yml DEL(17/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U DEL(18/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U.yml DEL(19/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v DEL(20/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v.yml CPAN.pm: Building C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Install scripts? y/n [n ] n Do you want to run tests that require connection to servers across the internet (likely to cause some failures)? y/n [n ] n - will not run internet-requiring tests Created MYMETA.yml and MYMETA.json Creating new 'Build' script for 'BioPerl-Run' version '1.006900' Building BioPerl-Run CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build -- OK Running Build test t/Amap.t ...................... 1/18 # Required executable for Bio::Tools::Run::Alignment::Amap is not present t/Amap.t ...................... ok t/AnalysisFactory_soap.t ...... skipped: Network tests have not been requested t/Analysis_soap.t ............. skipped: Network tests have not been requested t/BEDTools.t .................. 3/423 # Required executable for Bio::Tools::Run::BEDTools is not present t/BEDTools.t .................. ok t/BWA.t ....................... 1/36 # Required executable for Bio::Tools::Run::BWA is not present t/BWA.t ....................... ok t/Blat.t ...................... 1/33 # Required executable for Bio::Tools::Run::Alignment::Blat is not present # Looks like you planned 33 tests but ran 20. t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 13/33 subtests (less 15 skipped subtests: 5 okay) t/Bowtie.t .................... 
1/73 # Required executable for Bio::Tools::Run::Bowtie is not present t/Bowtie.t .................... ok t/Cap3.t ...................... 1/91 # Required executable for Bio::Tools::Run::Cap3 is not present t/Cap3.t ...................... ok t/Clustalw.t .................. 1/45 # Required executable for Bio::Tools::Run::Alignment::Clustalw is not present t/Clustalw.t .................. ok t/Coil.t ...................... 2/6 # Required executable for Bio::Tools::Run::Coil is not present t/Coil.t ...................... ok t/Consense.t .................. 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::Consense is not present t/Consense.t .................. ok t/DBA.t ....................... 1/18 # Required executable for Bio::Tools::Run::Alignment::DBA is not present t/DBA.t ....................... ok t/DrawGram.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawGram is not present t/DrawGram.t .................. ok t/DrawTree.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawTree is not present t/DrawTree.t .................. ok t/EMBOSS.t .................... ok t/Ensembl.t ................... skipped: Network tests have not been requested t/Eponine.t ................... 1/7 # Looks like you planned 7 tests but ran 2. t/Eponine.t ................... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 5/7 subtests t/Exonerate.t ................. 1/89 # Required executable for Bio::Tools::Run::Alignment::Exonerate is not present t/Exonerate.t ................. ok t/FootPrinter.t ............... 1/24 # Required executable for Bio::Tools::Run::FootPrinter is not present t/FootPrinter.t ............... ok t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable $GENEMARK_MODELS is not set t/Genemark.hmm.prokaryotic.t .. ok t/Genewise.t .................. 1/20 # Required executable for Bio::Tools::Run::Genewise is not present t/Genewise.t .................. 
ok t/Genscan.t ................... 1/6 # Required environment variable $GENSCANDIR is not set t/Genscan.t ................... ok t/Gerp.t ...................... 1/33 # Required executable for Bio::Tools::Run::Phylo::Gerp is not present t/Gerp.t ...................... ok t/Glimmer2.t .................. 1/217 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer2.t .................. ok t/Glimmer3.t .................. 1/111 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer3.t .................. ok t/Gumby.t ..................... 1/124 # Required executable for Bio::Tools::Run::Phylo::Gumby is not present t/Gumby.t ..................... ok t/Hmmer.t ..................... 1/27 # Required executable for Bio::Tools::Run::Hmmer is not present t/Hmmer.t ..................... ok t/Hyphy.t ..................... 2/15 # Required executable for Bio::Tools::Run::Phylo::Hyphy::SLAC is not present t/Hyphy.t ..................... ok t/Infernal.t .................. 1/43 # Required executable for Bio::Tools::Run::Infernal is not present t/Infernal.t .................. ok t/Kalign.t .................... 1/8 # Required executable for Bio::Tools::Run::Alignment::Kalign is not present t/Kalign.t .................... ok t/LVB.t ....................... 1/19 # Required executable for Bio::Tools::Run::Phylo::LVB is not present t/LVB.t ....................... ok t/Lagan.t ..................... 1/12 # Required executable for Bio::Tools::Run::Alignment::Lagan is not present t/Lagan.t ..................... ok t/MAFFT.t ..................... 1/17 # Required executable for Bio::Tools::Run::Alignment::MAFFT is not present t/MAFFT.t ..................... ok t/MCS.t ....................... 1/24 # Required executable for Bio::Tools::Run::MCS is not present t/MCS.t ....................... ok t/Maq.t ....................... 1/51 # Required executable for Bio::Tools::Run::Maq is not present t/Maq.t ....................... ok t/Match.t ..................... 
1/7 # Required executable for Bio::Tools::Run::Match is not present t/Match.t ..................... ok t/Mdust.t ..................... 1/5 # Required executable for Bio::Tools::Run::Mdust is not present t/Mdust.t ..................... ok t/Meme.t ...................... 1/25 # Required executable for Bio::Tools::Run::Meme is not present t/Meme.t ...................... ok t/Minimo.t .................... 1/72 # Required executable for Bio::Tools::Run::Minimo is not present t/Minimo.t .................... ok t/Molphy.t .................... 1/10 # Required executable for Bio::Tools::Run::Phylo::Molphy::ProtML is not present t/Molphy.t .................... ok t/Muscle.t .................... 1/16 # Required executable for Bio::Tools::Run::Alignment::Muscle is not present t/Muscle.t .................... ok t/Neighbor.t .................. 1/17 # Required executable for Bio::Tools::Run::Phylo::Phylip::Neighbor is not present t/Neighbor.t .................. ok t/Newbler.t ................... 1/98 # Required executable for Bio::Tools::Run::Newbler is not present t/Newbler.t ................... ok t/Njtree.t .................... 1/6 # Required executable for Bio::Tools::Run::Phylo::Njtree::Best is not present t/Njtree.t .................... ok t/PAML.t ...................... 1/28 # Required executable for Bio::Tools::Run::Phylo::PAML::Codeml is not present t/PAML.t ...................... ok t/Pal2Nal.t ................... 1/9 # Required executable for Bio::Tools::Run::Alignment::Pal2Nal is not present t/Pal2Nal.t ................... ok t/PhastCons.t ................. 1/181 # Required executable for Bio::Tools::Run::Phylo::Phast::PhastCons is not present t/PhastCons.t ................. ok t/Phrap.t ..................... 1/127 # Required executable for Bio::Tools::Run::Phrap is not present t/Phrap.t ..................... ok t/Phyml.t ..................... 1/47 # Required executable for Bio::Tools::Run::Phylo::Phyml is not present t/Phyml.t ..................... 
ok t/Primate.t ................... 1/8 # Required executable for Bio::Tools::Run::Primate is not present t/Primate.t ................... ok t/Primer3.t ................... 1/9 # Required executable for Bio::Tools::Run::Primer3 is not present t/Primer3.t ................... ok t/Prints.t .................... 1/7 # Required executable for Bio::Tools::Run::Prints is not present t/Prints.t .................... ok t/Probalign.t ................. 1/13 # Required executable for Bio::Tools::Run::Alignment::Probalign is not present t/Probalign.t ................. ok t/Probcons.t .................. 1/11 # Required executable for Bio::Tools::Run::Alignment::Probcons is not present t/Probcons.t .................. ok t/Profile.t ................... 1/7 # Required executable for Bio::Tools::Run::Profile is not present t/Profile.t ................... ok t/Promoterwise.t .............. 1/9 # Required executable for Bio::Tools::Run::Promoterwise is not present t/Promoterwise.t .............. ok t/ProtDist.t .................. 1/14 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtDist is not present t/ProtDist.t .................. ok t/ProtPars.t .................. 1/11 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtPars is not present t/ProtPars.t .................. ok t/Pseudowise.t ................ 1/18 # Required executable for Bio::Tools::Run::Pseudowise is not present t/Pseudowise.t ................ ok t/QuickTree.t ................. 1/13 # Required executable for Bio::Tools::Run::Phylo::QuickTree is not present t/QuickTree.t ................. ok t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as or not executable. # Required executable for Bio::Tools::Run::RepeatMasker is not present t/RepeatMasker.t .............. ok t/SABlastPlus.t ............... 1/65 # Required executable for Bio::Tools::Run::BlastPlus is not present # Looks like you planned 65 tests but ran 63. t/SABlastPlus.t ............... 
Dubious, test returned 255 (wstat 65280, 0xff00) Failed 2/65 subtests (less 59 skipped subtests: 4 okay) t/SLR.t ....................... 1/7 # Required executable for Bio::Tools::Run::Phylo::SLR is not present t/SLR.t ....................... ok t/Samtools.t .................. ok t/Seg.t ....................... 1/8 # Required executable for Bio::Tools::Run::Seg is not present t/Seg.t ....................... ok t/Semphy.t .................... 1/19 # Required executable for Bio::Tools::Run::Phylo::Semphy is not present t/Semphy.t .................... ok t/SeqBoot.t ................... 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present t/SeqBoot.t ................... ok t/Signalp.t ................... 1/7 # Required executable for Bio::Tools::Run::Signalp is not present t/Signalp.t ................... ok t/Sim4.t ...................... 1/23 # Required executable for Bio::Tools::Run::Alignment::Sim4 is not present t/Sim4.t ...................... ok t/Simprot.t ................... 1/6 # Required executable for Bio::Tools::Run::Simprot is not present t/Simprot.t ................... ok t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/SoapEU-unit.t ............... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/StandAloneFasta.t ........... 1/15 # Required executable for Bio::Tools::Run::Alignment::StandAloneFasta is not present t/StandAloneFasta.t ........... ok t/TCoffee.t ................... 1/27 # Required executable for Bio::Tools::Run::Alignment::TCoffee is not present t/TCoffee.t ................... ok t/TigrAssembler.t ............. 1/88 # Required executable for Bio::Tools::Run::TigrAssembler is not present # Required executable for Bio::Tools::Run::TigrAssembler is not present t/TigrAssembler.t ............. ok t/Tmhmm.t ..................... 
1/9 # Required executable for Bio::Tools::Run::Tmhmm is not present t/Tmhmm.t ..................... ok t/TribeMCL.t .................. ok t/Vista.t ..................... ok t/gmap-run.t .................. 1/8 # Required executable for Bio::Tools::Run::Alignment::Gmap is not present t/gmap-run.t .................. ok t/tRNAscanSE.t ................ 1/12 # Required executable for Bio::Tools::Run::tRNAscanSE is not present t/tRNAscanSE.t ................ ok Test Summary Report ------------------- t/Blat.t (Wstat: 65280 Tests: 20 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 33 tests but ran 20. t/Eponine.t (Wstat: 65280 Tests: 2 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 7 tests but ran 2. t/SABlastPlus.t (Wstat: 65280 Tests: 63 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 65 tests but ran 63. Files=80, Tests=2876, 39 wallclock secs ( 0.54 usr 0.23 sys + 32.54 cusr 4.94 csys = 38.25 CPU) Result: FAIL Failed 3/80 test programs. 0/2876 subtests failed. CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build test -- NOT OK //hint// to see the cpan-testers results for installing this module, try: reports CJFIELDS/BioPerl-Run-1.006900.tar.gz Running Build install make test had returned bad status, won't install without force From guy.leonard at gmail.com Wed Feb 6 13:35:38 2013 From: guy.leonard at gmail.com (guy.leonard at gmail.com) Date: Wed, 6 Feb 2013 10:35:38 -0800 (PST) Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> Nice, super work. Will there be a rough list of feature changes/addition/deprecation, or shall I consult git logs? 
On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote: > > All, > > I am scheduling the next BioPerl CPAN release tentatively for March 1. > Any help in triaging bug reports would be greatly appreciated! > > Amongst all other changes, as mentioned in a separate thread we will > remove Bio::FeatureIO, now developed in a separate repository: > > https://github.com/bioperl/Bio-FeatureIO > > Feedback, suggestions, etc are greatly appreciated. > > chris > _______________________________________________ > Bioperl-l mailing list > Biop...
at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From sidd.basu at gmail.com Wed Feb 6 14:36:17 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Wed, 6 Feb 2013 13:36:17 -0600 Subject: [Bioperl-l] Re: Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> Hi, On Tue, 05 Feb 2013, Fields, Christopher J wrote: > All, > > I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! > > Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: > > https://github.com/bioperl/Bio-FeatureIO > > Feedback, suggestions, etc are greatly appreciated. Here are the CI build reports for Perl 5.12, 5.14 and 5.16 from Travis. https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true I could not get 5.10 to work on Travis. Although I activated the --network option, it still didn't run one of the tests that needs network access. Also, I was initially confused by the fact that, although it has a dist.ini, the tests still have to be run through Build.PL. Running **dzil test** does not work. Hope this helps.
thanks, -siddhartha > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 6 14:46:49 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 19:46:49 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A109@CHIMBX5.ad.uillinois.edu> We've been a little better at keeping track of significant changes this time 'round. There aren't a lot of major updates, but it's important to make sure we get a release out to ensure everyone (not just those familiar with git) can access them. chris On Feb 6, 2013, at 12:35 PM, wrote: > Nice, super work. > > Will there be a rough list of feature changes/addition/deprecation, or > shall I consult git logs? > > On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote: >> >> All, >> >> I am scheduling the next BioPerl CPAN release tentatively for March 1. >> Any help in triaging bug reports would be greatly appreciated! >> >> Amongst all other changes, as mentioned in a separate thread we will >> remove Bio::FeatureIO, now developed in a separate repository: >> >> https://github.com/bioperl/Bio-FeatureIO >> >> Feedback, suggestions, etc are greatly appreciated. >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Biop... 
at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 6 14:54:58 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 19:54:58 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> On Feb 6, 2013, at 1:36 PM, Siddhartha Basu wrote: > Hi, > > On Tue, 05 Feb 2013, Fields, Christopher J wrote: > >> All, >> >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! >> >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: >> >> https://github.com/bioperl/Bio-FeatureIO >> >> Feedback, suggestions, etc are greatly appreciated. > > Here are CI build report on 5.12, 5.14 and 5.16 using travis. > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true > > Could not get 5.10 to work on travis. Though i activated the (--network) > option, it still didn't run one of the test that needs network. Also, initially got > confused by the fact that though it has dist.ini, the tests still has > to run through Build.PL. Running **dzil test** do not work. > > Hope this helps. > > thanks, > -siddhartha Just to point out, that was for Bio-FeatureIO. Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release). 
Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken). I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed. Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation. chris From sidd.basu at gmail.com Wed Feb 6 15:26:06 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Wed, 6 Feb 2013 14:26:06 -0600 Subject: [Bioperl-l] Re: Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> Message-ID: <5112bc60.c69e320a.1e98.2028@mx.google.com> On Wed, 06 Feb 2013, Fields, Christopher J wrote: > On Feb 6, 2013, at 1:36 PM, Siddhartha Basu > wrote: > > > Hi, > > > > On Tue, 05 Feb 2013, Fields, Christopher J wrote: > > > >> All, > >> > >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! > >> > >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: > >> > >> https://github.com/bioperl/Bio-FeatureIO > >> > >> Feedback, suggestions, etc are greatly appreciated. > > > > Here are CI build report on 5.12, 5.14 and 5.16 using travis. > > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true > > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true > > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true > > > > Could not get 5.10 to work on travis. Though i activated the (--network) > > option, it still didn't run one of the test that needs network. 
Also, initially got > > confused by the fact that though it has dist.ini, the tests still has > > to run through Build.PL. Running **dzil test** do not work. > > > > Hope this helps. > > > > thanks, > > -siddhartha > > Just to point out, that was for Bio-FeatureIO. Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release). So, what steps are left for getting the release out to CPAN? For example, are there a lot of feature branches still left to be merged, or a lot of unit tests still not passing? I am just trying to figure out whether there is any way I could help expedite the release process. However, if these are already taken care of, please ignore this. > > Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken). I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed. As for the error I encountered, the presence of Build.PL was blocking the dzil build/release process. By default, dzil expects to generate Build.PL itself during its build/release process. However, I am not sure which mode is the most suitable for BioPerl devs. > Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation. thanks, -siddhartha > > chris From hlapp at drycafe.net Wed Feb 6 16:30:33 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 6 Feb 2013 16:30:33 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: Great points, George, and you're making a very compelling argument. I'm in total agreement.
It's almost becoming embarrassing to still be programming in Perl these days, so one might as well have fun while it lasts. -hilmar On Feb 6, 2013, at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: >> [...] >> Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point >> out that Python users are in the same boat: the Python version for >> CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 >> (and recommends python 2.7). >> >> We can always state that perl 5.8 is supported for the upcoming >> Bioperl release, but we're dropping v5.8 support for any future >> releases. > > Do more than drop support for 5.8. > > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time.
> > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. > > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Wed Feb 6 17:11:06 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:11:06 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> George, Should put your post on a pedestal :) tl;dr version: I completely agree, but we need help in order to do this. Long(-winded) version: I agree completely, backwards compatibility is killing us. But, we do need current and new people to get involved and help drive this forward. We need people on all fronts, from coding and bug fixes to documentation and web site maintenance. I've been driving this bus for a number of years now. Not getting tired yet, but I am getting substantially busier with my current endeavors, so my time spent working on BioPerl has dwindled considerably. Any additional support or sharing of responsibilities will help tremendously in keeping up momentum (if someone else wants to take the wheel for a bit, please let me know :). If we follow the perl release route, we should streamline the release process (think Dist::Zilla), end support of older versions of Perl, and work on a sustainable release schedule. The fact that we have so many of us so-called 'old folks' speaking up in favor of this is a very good sign. We do need a bit more than that; we need help. 
BioPerl is a very large project. There is a key point we need to address, one that is very important for the future of BioPerl. I use Perl quite a bit in my current work (dabble with Ruby and Python as well when I have to). BioPerl? A little, but not as much as I could. Shocked? The three main reasons I don't use it 'in anger': performance, performance, and performance. It is very important that we make a concerted effort to address this at all levels. It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them). A specific example: Heng Li once tested the performance of FASTQ parsing (perl, python, bioperl, biopython, his C code, etc). BioPerl's FASTQ couldn't even be measured; IIRC it went on for many hours until he killed it. This was with the older version of the parser, but I'm willing to bet the newer one I wrote isn't any better. This. needs. to. change. I see no problem in stating that any generic parsing and low-level interfaces are just as much a part of what BioPerl encompasses as the higher-level Bio::* classes themselves. Steve and Jason were on to something with SearchIO; it's maybe not as performant as we would like, but it certainly is more flexible in terms of what can be done, b/c it separates out low-level parsing from object creation. That's the general model we should look at. There is a good reason Biopython is following this model with their SearchIO implementation (Peter C, are you reading this?). We have a lot of very talented people involved with this project, both on the purely computational and purely biological end as well as the folks like me who straddle the two domains. There is a lot of good code out there that can be used, wrapped, taken advantage of, including everything we currently have in BioPerl. Let's come up with something that both works and works well, that people can use on a regular basis, even at a low level if they choose.
That alone would dissuade new users from writing up (yet another) custom FASTA/FASTQ/BLAST/GenBank/etc parser b/c the BioPerl one takes millennia to finish. A few examples on this front: Rob Buels created a generic parser for GFF3 (Bio::GFF3::LowLevel) with very few dependencies; we wrap this with the newer Bio::FeatureIO code. Leon has Bio::SFF. Lincoln of course wrote Bio::DB::Sam and Bio::DB::BigFile. I have started a wrapper around Heng's FASTQ/FASTA parsing code (kseq); it seems to work quite well (~20M FASTQ in 30 sec last I recall?). So: If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that. If it means creating a new Bio-NGS repo to focus some of these efforts, so be it. If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it. If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes). If it means this is to be BioPerl 2.0, then let's move in that direction, sooner rather than later. But I can't do it alone. We (not just me, but we) need to drive the direction we take. First one who codes gets the gold ring. chris On Feb 6, 2013, at 12:47 PM, Aaron Mackey wrote: > Huzzah! > > -- > Aaron J. Mackey, PhD > Assistant Professor > Center for Public Health Genomics > University of Virginia > amackey at virginia.edu > http://www.cphg.virginia.edu/mackey > > > On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: > > [...] > > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point > > out that Python users are in the same boat: the Python version for > > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > > (and recommends python 2.7). > > > > We can always state that perl 5.8 is supported for the upcoming > > Bioperl release, but we're dropping v5.8 support for any future > > releases. > > Do more than drop support for 5.8.
> > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. > > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. > > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. 
They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Wed Feb 6 17:34:42 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:34:42 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1AF0C@CHIMBX5.ad.uillinois.edu> I want to clarify, parser optimization isn't the only point we need to focus on by any means (and may not be the main one). 
There is a lot of room for improvement top to bottom; that was one specific example I have long held to be an issue.

-c

On Feb 6, 2013, at 4:11 PM, "Fields, Christopher J" wrote:

> Shocked? The main three reason I don't use it 'in anger': performance, performance, and performance. It is very important that we make a concerted effort to address this at all levels. It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them).
...

From p.j.a.cock at googlemail.com Wed Feb 6 17:43:13 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 6 Feb 2013 22:43:13 +0000
Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
 <5111C653.2010703@gmail.com>
 <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
 <20754.39343.128576.743448@gargle.gargle.HOWL>
 <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID:

On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J wrote:
>
> I see no problem in stating any generic parsing and low-level interfaces
> are just as much a part of what BioPerl encompasses as the higher-level
> Bio::* classes themselves. Steve and Jason were on to something with
> SearchIO; it's maybe not as performant as we would like, but it certainly
> is more flexible in terms of what can be done, b/c it separates out
> low-level parsing from object creation. That's the general model we
> should look at. There is a good reason Biopython is following this
> model with their SearchIO implementation (Peter C, are you reading this?)

Actually I don't think we did end up with that kind of separation in the Biopython SearchIO, which is not to say it isn't an excellent model to follow.
Rather the Biopython SearchIO (like the BioPerl one) had as the first goal a consistent object model across assorted file formats.

The idea of low-level, minimal-overhead parsers (which are very format specific), on which a heavier but consistent object model can be built, might be a good balance: the high-level API has the convenience, but if you give that up you can have more speed. That's what I recommend with FASTQ and Biopython, e.g. http://news.open-bio.org/news/2009/09/biopython-fast-fastq/

>
> I have started a wrapper around Heng's FASTQ/FASTA parsing
> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> last I recall?).
>

I'd have to dig through my emails, but I think the BioRuby guys looked at that too. As I recall, while it was fast, the error handling left something to be desired. Email me directly or on the BioRuby list if you want to follow up on that.

Regards,

Peter

From cjfields at illinois.edu Wed Feb 6 17:53:21 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:53:21 +0000
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version
In-Reply-To:
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
 <5111C653.2010703@gmail.com>
 <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
 <20754.39343.128576.743448@gargle.gargle.HOWL>
 <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>

On Feb 6, 2013, at 4:43 PM, Peter Cock wrote:

> On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J
> wrote:
>>
>> I see no problem in stating any generic parsing and low-level interfaces
>> are just as much a part of what BioPerl encompasses as the higher-level
>> Bio::* classes themselves.
Steve and Jason were on to something with >> SearchIO; it's maybe not as performant as we would like, but it certainly >> is more flexible in terms of what can be done, b/c it separates out >> low-level parsing from object creation. That's the general model we >> should look at. There is a good reason Biopython is following this >> model with their SearchIO implementation (Peter C, are you reading this?) > > Actually I don't think we did end up with that kind of separation in the > Biopython SearchIO - which is not so say it isn't an excellent model > to follow. Rather the Biopython SearchIO (like the BioPerl one) had > as the first goal a consistent object model across assorted file > formats. > > The idea of a low level minimal overhead parsers (which are very > format specific), on which a heavier but consistent object model > can be built might be a good balance - the high level API has the > connivence, but if you give that up you can have more speed. > That's what I recommend with FASTQ and Biopython, e.g. > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > >> >> I have started a wrapper around Heng's FASTQ/FASTA parsing >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec >> last I recall?). >> > > I'd have to dig through my emails, but I think the BioRuby guys > looked at that too - as I recall while it was fast, the error handling > left something to be desired. Email me directly or on the BioRuby > list if you want to follow up on that. > > Regards, > > Peter I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out. IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into. Maybe worth moving to open-bio-l for broader discussion. 
chris

From whereverroadgoes at gmail.com Wed Feb 6 16:59:04 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Wed, 6 Feb 2013 13:59:04 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <87txpr26jj.fsf@topper.koldfront.dk>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
 <87txpr26jj.fsf@topper.koldfront.dk>
Message-ID: <411e920d-e614-417d-9198-78bef9adba16@googlegroups.com>

Everything's working now! Thank you very much, especially to you Adam!

>

From carandraug+dev at gmail.com Wed Feb 6 20:38:20 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Thu, 7 Feb 2013 01:38:20 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID:

On 5 February 2013 20:56, Fields, Christopher J wrote:
> On Feb 5, 2013, at 2:06 PM, Carnë Draug wrote:
>> how much perl backwards compatibility does bioperl needs to keep?
>
> Aim for 5.10.1, but be careful of smart-match.

Well, I solved my problem differently and ended up not needing any of the new features. But next time I'll know.

Thanks
Carnë

From pcantalupo at gmail.com Wed Feb 6 23:04:08 2013
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Wed, 6 Feb 2013 23:04:08 -0500
Subject: [Bioperl-l] bug 3376 status needs updated
Message-ID:

Hi,

A few months ago, I fixed bug 3376 (https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2). The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been updated to resolved or closed. Should I do this or is Chris the only one who does that?
Thank you, Paul From cjfields at illinois.edu Wed Feb 6 23:20:30 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 04:20:30 +0000 Subject: [Bioperl-l] bug 3376 status needs updated In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B45C@CHIMBX5.ad.uillinois.edu> No, go ahead and close it. Let me know if you run into perm. problems with it. chris On Feb 6, 2013, at 10:04 PM, Paul Cantalupo wrote: > Hi, > > A few months ago, I fixed bug 3376 ( > https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2). > The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been > updated to resolved or closed. Should I do this or is Chris the only one > who does that? > > Thank you, > > Paul > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From l.m.timmermans at students.uu.nl Thu Feb 7 04:07:57 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Thu, 7 Feb 2013 10:07:57 +0100 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <5112bc60.c69e320a.1e98.2028@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> <5112bc60.c69e320a.1e98.2028@mx.google.com> Message-ID: On Wed, Feb 6, 2013 at 9:26 PM, Siddhartha Basu wrote: > As far as the error i encountered, presence of Build.PL was blocking dzil > build/release process. And by default, dzil expects to generate > Build.PL during its build/release process. However, i am not sure which > mode is the most suitable for bioperl devs. You can prune the Build.PL, and then let dzil add its own. We wouldn't be the first to do that sort of thing. 
Leon From amackey at virginia.edu Thu Feb 7 10:25:07 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Thu, 7 Feb 2013 10:25:07 -0500 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> Message-ID: You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. -Aaron -- Aaron J. Mackey, PhD Assistant Professor Center for Public Health Genomics University of Virginia amackey at virginia.edu http://www.cphg.virginia.edu/mackey On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J wrote: > On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > > > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > > wrote: > >> > >> I see no problem in stating any generic parsing and low-level interfaces > >> are just as much a part of what BioPerl encompasses as the higher-level > >> Bio::* classes themselves. Steve and Jason were on to something with > >> SearchIO; it's maybe not as performant as we would like, but it > certainly > >> is more flexible in terms of what can be done, b/c it separates out > >> low-level parsing from object creation. That's the general model we > >> should look at. There is a good reason Biopython is following this > >> model with their SearchIO implementation (Peter C, are you reading > this?) 
> > > > Actually I don't think we did end up with that kind of separation in the > > Biopython SearchIO - which is not so say it isn't an excellent model > > to follow. Rather the Biopython SearchIO (like the BioPerl one) had > > as the first goal a consistent object model across assorted file > > formats. > > > > The idea of a low level minimal overhead parsers (which are very > > format specific), on which a heavier but consistent object model > > can be built might be a good balance - the high level API has the > > connivence, but if you give that up you can have more speed. > > That's what I recommend with FASTQ and Biopython, e.g. > > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > > > >> > >> I have started a wrapper around Heng's FASTQ/FASTA parsing > >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec > >> last I recall?). > >> > > > > I'd have to dig through my emails, but I think the BioRuby guys > > looked at that too - as I recall while it was fast, the error handling > > left something to be desired. Email me directly or on the BioRuby > > list if you want to follow up on that. > > > > Regards, > > > > Peter > > I did a little on this, worth following up on, but I pulled the FASTQ test > examples you created from the paper to test it out. IIRC it parsed where > it needed to, but I'm not sure how it handled bad sequences, so yes, worth > looking into. Maybe worth moving to open-bio-l for broader discussion. 
>
> chris
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From tiago.hori at gmail.com Thu Feb 7 09:58:37 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Thu, 7 Feb 2013 06:58:37 -0800 (PST)
Subject: [Bioperl-l] Search I::O
In-Reply-To: <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
References: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>
 <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
Message-ID:

Thanks, Jason! It is working now.

So here is what I am trying to accomplish. For a given BLASTx report, I want to extract the best BLASTx hit that is human and whose description does not contain "unnamed" or "PREDICTED". I got very close, but I still can't get it to give me only the top BLAST hit; it gives me all BLAST hits that meet my criteria. I tried using "last" to stop it from looping through the hits once it found a human one, but it didn't work. Can someone help? Here is my code so far (mostly stolen from the wiki).

    use strict;
    use Bio::SearchIO;

    my $in = new Bio::SearchIO(-format => 'blast',
                               -file   => 'testsalmon.txt');
    while( my $result = $in->next_result ) {
      ## $result is a Bio::Search::Result::ResultI compliant object
      while( my $hit = $result->next_hit ) {
        ## $hit is a Bio::Search::Hit::HitI compliant object
        if( $hit->description !~ /[Uu]nnamed|PREDICTED|hypothetical/){
          if( $hit->description =~ /Homo sapiens/){
            while( my $hsp = $hit->next_hsp ) {
              ## $hsp is a Bio::Search::HSP::HSPI compliant object
              if( $hsp->length('total') > 50 ) {
                if ( $hsp->percent_identity >= 30) {
                  if( $hsp->evalue <= 1e-05){
                    print "Query=", $result->query_name,"\t",
                      " Description=", $hit->description,"\t",
                      " Hit=", $hit->name,"\t",
                      " Length=", $hsp->length('total'),"\t",
                      " Percent_id=", $hsp->percent_identity,"\n";
                  }
                }
              }
            }
          }
        }
      }
    }

T.
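One possible way to get only the first qualifying hit per query is sketched below. It assumes the report lists hits best-first (as BLAST text output normally does) and that a bare `last` only left the innermost HSP loop, which may or may not be what happened in the code above. The helper name `first_human_hit` is made up for this sketch; the SearchIO calls are the same ones used in the script above. Returning from a sub leaves both the hit loop and the HSP loop at once, so nothing further is printed for that query.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first hit (in report order) whose
# description passes the filters and that has at least one HSP passing
# the thresholds, together with that HSP. Returns an empty list otherwise.
sub first_human_hit {
    my ($result) = @_;
    while ( my $hit = $result->next_hit ) {
        my $desc = $hit->description;
        next if $desc =~ /[Uu]nnamed|PREDICTED|hypothetical/;
        next unless $desc =~ /Homo sapiens/;
        while ( my $hsp = $hit->next_hsp ) {
            next unless $hsp->length('total') > 50;
            next unless $hsp->percent_identity >= 30;
            next unless $hsp->evalue <= 1e-05;
            return ( $hit, $hsp );    # leaves both loops at once
        }
    }
    return;
}

# Only runs when a report file is given on the command line.
if (@ARGV) {
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new( -format => 'blast', -file => $ARGV[0] );
    while ( my $result = $in->next_result ) {
        # Skip queries with no qualifying human hit.
        my ( $hit, $hsp ) = first_human_hit($result) or next;
        print join( "\t",
            "Query=" . $result->query_name,
            "Description=" . $hit->description,
            "Hit=" . $hit->name,
            "Length=" . $hsp->length('total'),
            "Percent_id=" . $hsp->percent_identity ),
            "\n";
    }
}
```

Saved as, say, top_human_hit.pl (a made-up file name), it would be run as `perl top_human_hit.pl testsalmon.txt`. A labelled loop (`HIT: while (...) { ... last HIT; }`) would be another way to get the same effect without a sub.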
On Wednesday, February 6, 2013 6:46:47 PM UTC-3:30, Jason Stajich wrote: > > you are missing a comma after the -format => 'blast' > should be > my $in = Bio::SearchIO->new(-format => 'blast', > -file => 'XXX' ); > > > On Feb 5, 2013, at 7:21 AM, Tiago Hori > > wrote: > > > Hi All, > > > > I am trying to find the best putative orthologs for 44K Atlantic Salmon > > sequences, and so I need to parse 44K BLAST reports to find the best > human > > hit. I am trying to learn Seach::IO, but when I try the first example on > > the HOWTO: use strict; > > use Bio::SearchIO; > > > > my $in = new Bio::SearchIO(-format => 'blast' > > -file => 'C001R047.txt'); > > > > while( my $result = $in->next_result ) { > > ## $result is a Bio::Search::Result::ResultI compliant object > > while( my $hit = $result->next_hit ) { > > ## $hit is a Bio::Search::Hit::HitI compliant object > > while( my $hsp = $hit->next_hsp ) { > > ## $hsp is a Bio::Search::HSP::HSPI compliant object > > if( $hsp->length('total') > 50 ) { > > if ( $hsp->percent_identity >= 75 ) { > > print "Query=", $result->query_name, > > " Hit=", $hit->name, > > " Length=", $hsp->length('total'), > > " Percent_id=", $hsp->percent_identity, "\n"; > > } > > } > > } > > } > > } > > > > I get this error: Odd number of elements in hash assignment at > > /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. > > > > I am using BioPerl version 1.6.901. Is there a format problem with the > > blast reports? > > > > Any help would be greatly appreciated! > > > > T. > > _______________________________________________ > > Bioperl-l mailing list > > Biop... at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Jason Stajich > jason.... at gmail.com > ja... 
at bioperl.org > > From cjfields at illinois.edu Thu Feb 7 10:56:04 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 15:56:04 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> This will likely be the approach for more NGS-friendly Bio::Seq class. Calculation of the PHRED scores could also be deferred until needed. seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it. chris On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. > > -Aaron > > -- > Aaron J. Mackey, PhD > Assistant Professor > Center for Public Health Genomics > University of Virginia > amackey at virginia.edu > http://www.cphg.virginia.edu/mackey > > > On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J wrote: > On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > > > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > > wrote: > >> > >> I see no problem in stating any generic parsing and low-level interfaces > >> are just as much a part of what BioPerl encompasses as the higher-level > >> Bio::* classes themselves. 
Steve and Jason were on to something with > >> SearchIO; it's maybe not as performant as we would like, but it certainly > >> is more flexible in terms of what can be done, b/c it separates out > >> low-level parsing from object creation. That's the general model we > >> should look at. There is a good reason Biopython is following this > >> model with their SearchIO implementation (Peter C, are you reading this?) > > > > Actually I don't think we did end up with that kind of separation in the > > Biopython SearchIO - which is not so say it isn't an excellent model > > to follow. Rather the Biopython SearchIO (like the BioPerl one) had > > as the first goal a consistent object model across assorted file > > formats. > > > > The idea of a low level minimal overhead parsers (which are very > > format specific), on which a heavier but consistent object model > > can be built might be a good balance - the high level API has the > > connivence, but if you give that up you can have more speed. > > That's what I recommend with FASTQ and Biopython, e.g. > > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > > > >> > >> I have started a wrapper around Heng's FASTQ/FASTA parsing > >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec > >> last I recall?). > >> > > > > I'd have to dig through my emails, but I think the BioRuby guys > > looked at that too - as I recall while it was fast, the error handling > > left something to be desired. Email me directly or on the BioRuby > > list if you want to follow up on that. > > > > Regards, > > > > Peter > > I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out. IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into. Maybe worth moving to open-bio-l for broader discussion. 
>
> chris
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From amackey at virginia.edu Thu Feb 7 11:09:14 2013
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 7 Feb 2013 11:09:14 -0500
Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
 <5111C653.2010703@gmail.com>
 <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu>
 <20754.39343.128576.743448@gargle.gargle.HOWL>
 <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
 <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu>
 <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu>
Message-ID:

e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries:

    while (my $seq = $seqio->next_seq) { $ct++ }

In other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries.

This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, even though the user never accesses most of the material.

Lastly, with respect to performance, if the parsing/object-building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using. What's the state of thread-safe Perl these days?
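A minimal sketch of the pull-based idea above, in plain Perl with no BioPerl dependency. The names `LazyFastqRecord` and `next_fastq_chunk` are invented for illustration, and the sketch assumes unwrapped four-line FASTQ records with Phred+33 qualities. The reader only slices off four lines per record; the ID and the quality scores are decoded the first time someone asks for them.

```perl
use strict;
use warnings;

# Hypothetical names throughout; assumes unwrapped four-line FASTQ records
# with Phred+33 (Sanger) quality encoding.
package LazyFastqRecord;

# Keep the four raw lines; parse nothing yet.
sub new {
    my ( $class, @lines ) = @_;
    return bless { raw => [@lines] }, $class;
}

# Extract the ID from the header line only on first access.
sub id {
    my ($self) = @_;
    unless ( defined $self->{id} ) {
        my ($id) = $self->{raw}[0] =~ /^\@(\S+)/
            or die "Bad FASTQ header: $self->{raw}[0]\n";
        $self->{id} = $id;
    }
    return $self->{id};
}

sub seq { return $_[0]{raw}[1] }

# Decode quality scores lazily and cache the result.
sub quals {
    my ($self) = @_;
    $self->{quals} ||= [ map { ord($_) - 33 } split //, $self->{raw}[3] ];
    return $self->{quals};
}

package main;

# Pull the next record: just enough work to find the record boundary.
sub next_fastq_chunk {
    my ($fh) = @_;
    my @lines;
    for ( 1 .. 4 ) {
        my $line = <$fh>;
        last unless defined $line;
        chomp $line;
        push @lines, $line;
    }
    return unless @lines;    # clean end of input
    die "Truncated FASTQ record\n" unless @lines == 4;
    return LazyFastqRecord->new(@lines);
}

# Demo on an in-memory "file" with two records.
my $data = "\@r1 first\nACGT\n+\nIIII\n\@r2\nGGCC\n+\n!!5I\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my $count = 0;
my @records;
while ( my $rec = next_fastq_chunk($fh) ) {
    $count++;                # counting alone never touches id() or quals()
    push @records, $rec;
}
close $fh;
print "records: $count\n";
print $records[1]->id, " quals: @{ $records[1]->quals }\n";
```

The counting loop pays only for reading lines; the regex match and the per-base decoding happen solely for the one record whose accessors are called at the end. Error checking is deliberately thin here (only truncated records and bad headers are caught), which is the usual trade-off with this style.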
-Aaron On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < cjfields at illinois.edu> wrote: > This will likely be the approach for more NGS-friendly Bio::Seq class. > Calculation of the PHRED scores could also be deferred until needed. > > seqtk has some C-based methods that we could possibly take advantage of, > but will have to look into it. > > chris > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > You might also want to consider a lazy/pull-based parser to defer > parsing/object-building for pieces of the object that don't get used. This > also usually provides some error tolerance. > > > > -Aaron > From sidd.basu at gmail.com Thu Feb 7 11:38:47 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Thu, 7 Feb 2013 10:38:47 -0600 Subject: [Bioperl-l] Re: FASTQ, was Re:BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: <5113d899.ea64320a.489a.262d@mx.google.com> Another approach might be use map-reduce(Hadoop) if possible. I have seen one implementation in biopython's GFF3 parser. http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/ -siddhartha On Thu, 07 Feb 2013, Aaron Mackey wrote: > e.g., a pull-based FASTQ parser that did nothing else at the top level but > "chunk" the file into as-yet-unparsed four-line blobs could appear to work > very fast, if the user code did nothing but count the number of entries: > > while (my $seq = $seqio->nextseq) { $ct++ }; > > in other words, you defer *everything* except the minimal amount of > parsing/logic required to detect object boundaries. 
> > This is, in fact, the exact opposite of the event-based SearchIO "push" > parsers, which always perform the most parsing possible, despite the user > never accessing most of the material. > > Lastly, with respect to performance, if the parsing/object building > operation is not simply IO bound, then parallel parser/object-building CPU > threads could be considered, which could then dynamically adapt to > pre-parse attributes (e.g. quality scores) that the calling code was > actually using. What's the state of thread-safe Perl these days? > > -Aaron > > > On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < > cjfields at illinois.edu> wrote: > > > This will likely be the approach for more NGS-friendly Bio::Seq class. > > Calculation of the PHRED scores could also be deferred until needed. > > > > seqtk has some C-based methods that we could possibly take advantage of, > > but will have to look into it. > > > > chris > > > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > > > You might also want to consider a lazy/pull-based parser to defer > > parsing/object-building for pieces of the object that don't get used. This > > also usually provides some error tolerance. 
> > > > > > -Aaron > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Feb 7 11:55:53 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 16:55:53 +0000 Subject: [Bioperl-l] FASTQ, was Re:BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <5113d899.ea64320a.489a.262d@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> <5113d899.ea64320a.489a.262d@mx.google.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7B8@CHIMBX5.ad.uillinois.edu> I think we will want to allow for a multitude of implementations. SeqIO already allows for that to a degree, but multiple backend implementations (say, different ways of parsing/processing FASTQ and others) isn't supported yet. chris On Feb 7, 2013, at 10:38 AM, Siddhartha Basu wrote: > Another approach might be use map-reduce(Hadoop) if possible. I have > seen one implementation in biopython's GFF3 parser. > http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/ > > -siddhartha > > > On Thu, 07 Feb 2013, Aaron Mackey wrote: > >> e.g., a pull-based FASTQ parser that did nothing else at the top level but >> "chunk" the file into as-yet-unparsed four-line blobs could appear to work >> very fast, if the user code did nothing but count the number of entries: >> >> while (my $seq = $seqio->nextseq) { $ct++ }; >> >> in other words, you defer *everything* except the minimal amount of >> parsing/logic required to detect object boundaries. 
>> >> This is, in fact, the exact opposite of the event-based SearchIO "push" >> parsers, which always perform the most parsing possible, despite the user >> never accessing most of the material. >> >> Lastly, with respect to performance, if the parsing/object building >> operation is not simply IO bound, then parallel parser/object-building CPU >> threads could be considered, which could then dynamically adapt to >> pre-parse attributes (e.g. quality scores) that the calling code was >> actually using. What's the state of thread-safe Perl these days? >> >> -Aaron >> >> >> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < >> cjfields at illinois.edu> wrote: >> >>> This will likely be the approach for more NGS-friendly Bio::Seq class. >>> Calculation of the PHRED scores could also be deferred until needed. >>> >>> seqtk has some C-based methods that we could possibly take advantage of, >>> but will have to look into it. >>> >>> chris >>> >>> On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: >>> >>>> You might also want to consider a lazy/pull-based parser to defer >>> parsing/object-building for pieces of the object that don't get used. This >>> also usually provides some error tolerance. 
>>>> >>>> -Aaron >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Feb 7 12:01:07 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 17:01:07 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7EF@CHIMBX5.ad.uillinois.edu> re: thread-safe perl, so-so at best from what I understand. chris On Feb 7, 2013, at 10:09 AM, Aaron Mackey wrote: > e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries: > > while (my $seq = $seqio->nextseq) { $ct++ }; > > in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries. > > This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material. 
> > Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using. What's the state of thread-safe Perl these days? > > -Aaron > > > On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J wrote: > This will likely be the approach for more NGS-friendly Bio::Seq class. Calculation of the PHRED scores could also be deferred until needed. > > seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it. > > chris > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. > > > > -Aaron From hartzell at alerce.com Thu Feb 7 16:36:24 2013 From: hartzell at alerce.com (George Hartzell) Date: Thu, 7 Feb 2013 13:36:24 -0800 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <20756.7768.125680.662488@gargle.gargle.HOWL> Fields, Christopher J writes: > George, > > Should put your post on a pedestal :) > > tl;dr version: I completely agree, but we need help in order to do this. > [...] And therein lies the [a] problem. Don't look at me.... I'm not coding on bioinformatics problems these days (though I'm available...) so _maybe_ I shouldn't have gotten up on the soapbox. 
But I'm so sick of getting into arguments (or walking away from them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, you can't write good code in Perl, look - Ruby has GEMS!, etc...

Perl of the olden days was an easy language in which to write really shitty code. Even the Perl of the BioPerl heyday wasn't really much help; roll your own OO, roll your own distro-building, mountains of monkey-work to provide consistent POD, versioning, etc...

But that's not the Perl that I use. I have Moose and Moo. TAP and the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. MetaCPAN. Pinto. GitHub. Perlbrew. Wow.

It isn't any harder to write good code, for measures that I care about, using Perl than it is in *any* of the other similar languages.

And it's just as easy, and happens just as frequently, for people to write shitty (undocumented, untested, poorly managed, poorly packaged, ...) stuff in the other languages.

GET OFF MY LAWN, KID! (Yeah, I know...)

But BioPerl *is* dying. You might be standing on the shoulders of giants when you use it to solve a problem, but you *definitely* have those same giants (and their extended families) on your shoulders every time I see you try to move the project forward. All of that history has become the tail that's wagging the dog.

If all y'all are going to keep the thing alive, moving forward and contributing to new great works then make Apple your hero. Deprecate the stuff that's holding you back, give folks a path forward and move on.

Have fun. Use sharp tools. Do cool science. Build cool things. Advance your careers (forgot that one last time). Be reasonable and professional.

Supporting last year's projects is someone else's business opportunity.

g.

ps. Are all y'all following this thread?

http://news.ycombinator.com/item?id=5123022

Maybe someone should search down for this bit: "Where to start? Any list of this [sic] projects?" and insert a plug for the various open-bio projects.
(But "someone" doesn't work here, he said...). From cjfields at illinois.edu Thu Feb 7 18:12:19 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 23:12:19 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1D071@CHIMBX5.ad.uillinois.edu> On Feb 7, 2013, at 3:36 PM, George Hartzell wrote: > Fields, Christopher J writes: >> George, >> >> Should put your post on a pedestal :) >> >> tl;dr version: I completely agree, but we need help in order to do this. >> [...] > > And therein lies the [a] problem. Don't look at me.... > > I'm not coding on bioinformatics problems these days (though I'm > available...) so _maybe_ I shouldn't have gotten up on the soapbox. > > But I'm so sick of getting into arguments (or walking away from > them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, > you can't write good code in Perl, look - Ruby has GEMS!, etc? Right, but that's a perception not just in the Bio* world. It's larger and more pervasive than that. > Perl of the olden days was an easy language in which to write really > shitty code. Even the Perl of the BioPerl heyday wasn't really much > help; role your own OO, role your own distro-building, mountains of > monkey-work to provide consistent POD, versioning, etc... > > But that's not the Perl that I use. I have Moose and Moo. TAP and > the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. > MetaCPAN. Pinto. GitHub. Perlbrew. Wow. Yes, and that is the direction we need to go in. 
> It isn't any harder to write good code, for measures that I care > about, using Perl than it is *any* of the other similar languages. > > And it's just as easy, and happens just as frequently, for people to > write shitty (undocumented, untested, poorly managed, poorly packaged, > ...) stuff in the other languages. Oh, I know. I'm working on some very nice looking but terribly implemented Python code now. > GET OFF MY LAWN, KID! (Yeah, I know...) > > But BioPerl *is* dying. You might be standing on the shoulders of > giants when you use it to solve a problem, but you *definitely* have > those same giants (and their extended families) on your shoulders > every time I see you try move the project forward. All of that > history has become the tail that's wagging the dog. Yep. > If all y'all are going to keep the thing alive, moving forward and > contributing to new great works then make Apple your hero. Deprecate > the stuff that's holding you back, give folks a path forward and move > on. That's fine. > Have fun. Use sharp tools. Do cool science. Build cool things. > Advance your careers (forgot that one last time). Be reasonable and > professional. > > Supporting last year's projects is someone else's business > opportunity. > > g. Right, but this isn't just my show. I can't do this alone; it's simply too much code and I don't have even 1/4 the time I used to have. > ps. Are all y'all following this thread? > > http://news.ycombinator.com/item?id=5123022 > > Maybe someone should search down for this bit: "Where to start? Any > list of this [sic] projects?" and insert a plug for the various > open-bio projects. (But "someone" doesn't work here, he said?). Read the original guy's post. He's completely delusional (okay, maybe not *completely*, but he comes across as quite bitter and unrealistic). Frankly I don't feel so bad if he wants to leave. He doesn't like messy things. Biology is messy, if one doesn't understand that then computational biology is not for them. 
chris

From carandraug+dev at gmail.com Thu Feb 7 23:12:22 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Fri, 8 Feb 2013 04:12:22 +0000
Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version
Message-ID:

On 6 February 2013 22:11, "Fields, Christopher J" wrote:
> [...]
> So:
>
> If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.
>
> If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.
>
> If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.
>
> If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).
>
> If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later.
>
> But I can't do it alone. We (not just me, but we) need to drive the direction we take.
>
> First one who codes gets the gold ring.

Hi

I know I'm not much involved with bioperl development but here's my suggestion as maintainer of another quite modular free software project. I swear I'm not promoting it. Skip to the last paragraph for the very short version.

Octave Forge is now a collection of packages for GNU Octave, each released independently whenever its maintainer sees fit. But it wasn't like that before. For a long time, everything was released at the same time; there were no independent packages. Then it was decided to split it into sections: main, extra and nonfree (free software dependent on non-free libraries, now purged), and inside those, it was split into packages, each with its own maintainer.

But some packages were (and are) more active than the others. Some packages even came from single contributions and we never heard from the authors again. And so, with time, cruft settled in. We didn't want to remove the code, but no one was interested or comfortable enough in the field to fix it either. Packages that had much more active development were being dragged down by code that no one was maintaining.

So we broke with that and each package is now released independently. We have packages that haven't been released in 3 years, yes, but that just shows which packages no one cares about. Those have been marked as unmaintained and anyone can come around and make a release if they care about it.

As the maintainer of the project, I do *not* make the releases of the packages. The package maintainers prepare everything and upload it; I only run a handful of tests (takes me 10min), upload it to our server, and make the official announcement. I am also the maintainer of one of the packages, and have often made releases of unmaintained packages because I needed them. That's to show that if they are important enough for someone, they will get a release somehow. If they are not important, why would we waste our time on them anyway?

We now have around 5 package releases per month, many of them being minor releases with a handful of bug fixes. Preparing a release of a small package is much easier and much less trouble than preparing a giant release encompassing all of them at the same time.

Short version:
I'd recommend splitting the project into much smaller ones. Some of the small ones will wither and die but those are the less important ones, and will allow the others, the ones that people care about, freedom to grow faster. Bioperl would still be just one project that incorporates a hundred or so smaller modules. Let those who care the most about a specific module take care of it and make the releases. Releasing a module becomes much simpler, which means more releases, more activity, and the smaller code base for each module also makes it less intimidating for new contributors.

Carnë

From hartzell at alerce.com Fri Feb 8 01:17:17 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 22:17:17 -0800
Subject: [Bioperl-l] injecting a bit of levity....
Message-ID: <20756.39021.553502.116384@gargle.gargle.HOWL>

Perl's not dead. It's FAMOUS!

http://imgs.xkcd.com/comics/perl_problems.png

g.

From carandraug+dev at gmail.com Fri Feb 8 01:57:30 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Fri, 8 Feb 2013 06:57:30 +0000
Subject: [Bioperl-l] getting a Bio::Search::HSP::HSPI from Bio::SimpleAlign (to find differences between sequences)
Message-ID:

Hi

I already have a Bio::SimpleAlign object (obtained by running TCoffee through the bioperl-run module) and I'm trying to get a Bio::Search::HSP::HSPI object from a pair of the aligned sequences. How can I do this? I want to use the seq_inds method to compare the sequences.

Here's my actual problem, in case I should be solving it some other way. I have a bunch of protein isoform sequences. They have small differences between them: point mutations, small insertions or deletions, nothing too big. I want to make a table of the mutations that each of them has against the consensus sequence. I already made the alignment and have the consensus from "$align->consensus_string". Now, I want to get something like:

isoform1: Ala67Gly, His90_Met91insGln
isoform2: ....

The seq_inds method from the Bio::Search::HSP::HSPI class seems to do the part of finding the differences, but how can I get one? I can't find it in the documentation.

Any tips, including a different approach to my problem, are most appreciated.

Thanks,
Carnë
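One possible approach to the table asked for above, sketched in plain Perl without an HSP object: walk the two aligned strings column by column and report each difference against the reference. The function name list_changes is hypothetical, the sketch assumes both strings come from the same alignment (equal length, gaps written as '-'), and the "del" label for deletions is an assumption modelled on the notation in the question.

```perl
use strict;
use warnings;

# One-letter to three-letter amino acid codes.
my %three = (
    A => 'Ala', R => 'Arg', N => 'Asn', D => 'Asp', C => 'Cys',
    Q => 'Gln', E => 'Glu', G => 'Gly', H => 'His', I => 'Ile',
    L => 'Leu', K => 'Lys', M => 'Met', F => 'Phe', P => 'Pro',
    S => 'Ser', T => 'Thr', W => 'Trp', Y => 'Tyr', V => 'Val',
);

# Unknown symbols are reported as Xaa.
sub aa {
    my ($c) = @_;
    return exists $three{ uc $c } ? $three{ uc $c } : 'Xaa';
}

# Hypothetical helper: compare an aligned isoform against an aligned
# reference (e.g. the consensus) and return a list of change labels.
sub list_changes {
    my ($ref, $iso) = @_;
    die "aligned strings must have equal length\n"
        unless length($ref) == length($iso);

    my @r = split //, $ref;
    my @q = split //, $iso;
    my @changes;
    my $pos  = 0;     # 1-based position in the ungapped reference
    my $last = '';    # last reference residue seen
    my $i    = 0;

    while ($i < @r) {
        if ($r[$i] ne '-') {
            $pos++;
            if ($q[$i] eq '-') {
                push @changes, aa($r[$i]) . $pos . 'del';
            }
            elsif ($q[$i] ne $r[$i]) {
                push @changes, aa($r[$i]) . $pos . aa($q[$i]);
            }
            $last = $r[$i];
            $i++;
            next;
        }

        # Gap in the reference: collect the run of inserted residues.
        my $ins = '';
        while ($i < @r && $r[$i] eq '-') {
            $ins .= aa($q[$i]) if $q[$i] ne '-';
            $i++;
        }
        next unless length $ins;

        # Name the flanking reference residues, if there are any.
        my $left  = $pos     ? aa($last) . $pos          : 'start';
        my $right = $i < @r  ? aa($r[$i]) . ($pos + 1)   : 'end';
        push @changes, "${left}_${right}ins$ins";
    }
    return @changes;
}

# Toy alignment: one substitution, one insertion, one deletion.
my @changes = list_changes('MAH-MK', 'MGHQM-');
print 'isoform1: ', join(', ', @changes), "\n";
# prints "isoform1: Ala2Gly, His3_Met4insGln, Lys5del"
```

With a Bio::SimpleAlign object, the aligned strings could presumably be taken from each sequence returned by each_seq and compared against consensus_string, provided the consensus uses the same gap character; that part is untested here.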
From l.m.timmermans at students.uu.nl Fri Feb 8 06:18:58 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Fri, 8 Feb 2013 12:18:58 +0100 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell wrote: > But I'm so sick of getting into arguments (or walking away from > them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, > you can't write good code in Perl, look - Ruby has GEMS!, etc... > > Perl of the olden days was an easy language in which to write really > shitty code. Even the Perl of the BioPerl heyday wasn't really much > help; role your own OO, role your own distro-building, mountains of > monkey-work to provide consistent POD, versioning, etc... > > But that's not the Perl that I use. I have Moose and Moo. TAP and > the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. > MetaCPAN. Pinto. GitHub. Perlbrew. Wow. I share that experience. > But BioPerl *is* dying. You might be standing on the shoulders of > giants when you use it to solve a problem, but you *definitely* have > those same giants (and their extended families) on your shoulders > every time I see you try move the project forward. All of that > history has become the tail that's wagging the dog. I share your sentiment. Most of BioPerl is architected so badly I can't stomach it most days, and I've worked on hairy codebases included perl itself. There's just too much sick and wrong. It's like hundreds of dot-com-era cgi scripts. 
The problem (which is common in scientific computing) is that once code works it's effectively abandoned. BioPerl is essentially a gathering of more than a thousand such modules. > If all y'all are going to keep the thing alive, moving forward and > contributing to new great works then make Apple your hero. Deprecate > the stuff that's holding you back, give folks a path forward and move > on. That would be lovely, but who is going to do that? We're suffering from the tragedy of the commons. > Have fun. Use sharp tools. Do cool science. Build cool things. > Advance your careers (forgot that one last time). Be reasonable and > professional. Sounds like good advice to me :-) > Supporting last year's projects is someone else's business > opportunity. True! > ps. Are all y'all following this thread? > > http://news.ycombinator.com/item?id=5123022 > > Maybe someone should search down for this bit: "Where to start? Any > list of this [sic] projects?" and insert a plug for the various > open-bio projects. (But "someone" doesn't work here, he said...). Interesting discussion, though the original post is too cynical even for my taste. Leon From cjfields at illinois.edu Fri Feb 8 09:08:56 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 8 Feb 2013 14:08:56 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1DA2D@CHIMBX5.ad.uillinois.edu> On Feb 8, 2013, at 5:18 AM, Leon Timmermans wrote: > On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell wrote: >> But I'm so sick of getting into arguments (or walking away from >> them...) 
with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, >> you can't write good code in Perl, look - Ruby has GEMS!, etc... >> >> Perl of the olden days was an easy language in which to write really >> shitty code. Even the Perl of the BioPerl heyday wasn't really much >> help; role your own OO, role your own distro-building, mountains of >> monkey-work to provide consistent POD, versioning, etc... >> >> But that's not the Perl that I use. I have Moose and Moo. TAP and >> the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. >> MetaCPAN. Pinto. GitHub. Perlbrew. Wow. > > I share that experience. > >> But BioPerl *is* dying. You might be standing on the shoulders of >> giants when you use it to solve a problem, but you *definitely* have >> those same giants (and their extended families) on your shoulders >> every time I see you try move the project forward. All of that >> history has become the tail that's wagging the dog. > > I share your sentiment. Most of BioPerl is architected so badly I > can't stomach it most days, and I've worked on hairy codebases > included perl itself. There's just too much sick and wrong. It's like > hundreds of dot-com-era cgi scripts. > > The problem (which is common in scientific computing) is that once > code works it's effectively abandoned. BioPerl is essentially a > gathering of more than a thousand such modules. Yep, the progression from 'it works' to 'it works very well' tends to have very high activation energy. Many of the fixes tend to be more bandaids (get it working) than fundamental surgery. I tried my hand at this, got a few things done. >> If all y'all are going to keep the thing alive, moving forward and >> contributing to new great works then make Apple your hero. Deprecate >> the stuff that's holding you back, give folks a path forward and move >> on. > > That would be lovely, but who is going to do that? We're suffering > from the tragedy of the commons. 
Spot on, but we could break that path for the time being. I think BioPerl as is will have to be in maintenance mode; we need a new effort to break with older perl, older practices. >> Have fun. Use sharp tools. Do cool science. Build cool things. >> Advance your careers (forgot that one last time). Be reasonable and >> professional. > > Sounds like good advice to me :-) > >> Supporting last year's projects is someone else's business >> opportunity. > > True! We just need to make a bioperl 1.x branch for the maintenance bit, rechristen 'master' as 'v2', and just move on to fixing the f****** code. Let's move on that. >> ps. Are all y'all following this thread? >> >> http://news.ycombinator.com/item?id=5123022 >> >> Maybe someone should search down for this bit: "Where to start? Any >> list of this [sic] projects?" and insert a plug for the various >> open-bio projects. (But "someone" doesn't work here, he said...). > > Interesting discussion, though the original post is too cynical even > for my taste. > > Leon Yes, that's not unusual unfortunately. We have a number of physicists and mathematicians here who have started their initial forays into computational biology, they're all startled at how noisy it is and how messy code can. Of course their disciplines have had the benefit of teaching students how to (somewhat decently) code for the last 40 years. chris From l.m.timmermans at students.uu.nl Fri Feb 8 07:08:06 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Fri, 8 Feb 2013 13:08:06 +0100 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: Message-ID: On Fri, Feb 8, 2013 at 5:12 AM, Carn? Draug wrote: > Short version: > I'd recommend to split the project into much smaller ones. Some of the > small ones will wither and die but those are the less important ones, > and will allow the others, the ones that people care about, freedom to > grow faster. 
Bioperl would still be just one project, that > incorporates a hundred or so of smaller modules. Let those who care > the most about a specific module to take care of it and make the > releases. Releasing a module becomes much simpler, which means more > releases, more activity, and the smaller code base for each module > also make it less intimidating for new contributors. That has been a goal for some time now, but it's fairly complicated. Not only do we have a LOT of modules (bioperl-live alone is more than 900), they also have complicated dependencies. I've attached the results of my static dependency analysis of bioperl-live. I suspect this split-up needs to done by automated graph analysis, it's too much to do by hand. Leon -------------- next part -------------- A non-text attachment was scrubbed... Name: deps.dot Type: application/octet-stream Size: 93463 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: deps.png Type: image/png Size: 6694525 bytes Desc: not available URL: From sebastien.moretti at unil.ch Fri Feb 8 11:19:29 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=) Date: Fri, 08 Feb 2013 17:19:29 +0100 Subject: [Bioperl-l] PhyloXML Message-ID: <51152591.9010402@unil.ch> Hi I would like to add some XML to an existing PhyloXML tree. No problem to read and write it. I would like to add smthg after the tag as in http://www.phyloxml.org/examples_syntax/phyloxml_syntax_example_1.html but get problems with add_phyloXML_annotation() : Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, line 1 (#1) (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor does any of its base classes. See perlobj. 
Uncaught exception from user code: Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, line 1. at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984 Bio::TreeIO::phyloxml::element_default('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 670 Bio::TreeIO::phyloxml::processXMLNode('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 309 Bio::TreeIO::phyloxml::add_phyloXML_annotation('Bio::TreeIO::phyloxml=HASH(0x134b1268)', '-obj', 'Bio::Tree::Tree=HASH(0x13525258)', '-xml', 'SUMF family') called at ./add_annotation_to_phyloxml.pl line 40 I think I do something wrong but what ? Here is the code my $treeio = new Bio::TreeIO(-file => "$infile", -format => 'phyloxml', ); my $tree = $treeio->next_tree; # Add annotation $treeio->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); -- S?bastien Moretti From cjfields at illinois.edu Sat Feb 9 01:25:17 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 06:25:17 +0000 Subject: [Bioperl-l] BioPerl future Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). 
That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? chris From cjfields at illinois.edu Sat Feb 9 01:43:24 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 06:43:24 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F2C6@CHIMBX5.ad.uillinois.edu> On Feb 8, 2013, at 6:08 AM, Leon Timmermans wrote: > On Fri, Feb 8, 2013 at 5:12 AM, Carn? Draug wrote: >> Short version: >> I'd recommend to split the project into much smaller ones. Some of the >> small ones will wither and die but those are the less important ones, >> and will allow the others, the ones that people care about, freedom to >> grow faster. Bioperl would still be just one project, that >> incorporates a hundred or so of smaller modules. Let those who care >> the most about a specific module to take care of it and make the >> releases. Releasing a module becomes much simpler, which means more >> releases, more activity, and the smaller code base for each module >> also make it less intimidating for new contributors. > > That has been a goal for some time now, but it's fairly complicated. > Not only do we have a LOT of modules (bioperl-live alone is more than > 900), they also have complicated dependencies. I've attached the > results of my static dependency analysis of bioperl-live. I suspect > this split-up needs to done by automated graph analysis, it's too much > to do by hand. > > Leon > Leon, I'm hoping we can do this sooner than later. In fact, if we proceed with make a 'v1' branch or something similar, we can start extricating out code sooner than later (next few weeks). 
chris From cjfields at illinois.edu Sat Feb 9 08:51:35 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 13:51:35 +0000 Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future Message-ID: Sheldon, The branch is where the old (v1.x) code would reside. Master branch would be v2. Chris Sent via phone -------- Original message -------- From: Sheldon McKay Date: To: "Fields, Christopher J" Cc: BioPerl List ,gmod-gbrowse at lists.sourceforge.net Subject: Re: [Gmod-gbrowse] BioPerl future Hi Chris, This sounds like a good idea. I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands. One question I have is whether the name of branch v1 might be perceived as a step backward. How about v2? Sheldon On Saturday, February 9, 2013, Fields, Christopher J wrote: All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? 
chris _______________________________________________ Gmod-gbrowse mailing list Gmod-gbrowse at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse -- Sheldon McKay, PhD Computational Biologist DNA Learning Center Cold Spring Harbor Laboratory 1 Bungtown Rd Cold Spring Harbor, NY 11724 (516) 367-5185 www.dnalc.org From sheldon.mckay at gmail.com Sat Feb 9 08:04:50 2013 From: sheldon.mckay at gmail.com (Sheldon McKay) Date: Sat, 9 Feb 2013 08:04:50 -0500 Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> Message-ID: Hi Chris, This sounds like a good idea. I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands. One question I have is whether the name of branch v1 might be perceived as a step backward. How about v2? Sheldon On Saturday, February 9, 2013, Fields, Christopher J wrote: > All, > > (cross-posting to gmod-gbrowse) > > I want to gauge the community's thoughts on a few things. At the moment I > think we can safely say that BioPerl 1.x is in maintenance mode. By > 'maintenance mode', I mean that we can only do so much with it w/o breaking > backwards compatibility with old scripts. We need a way forward so that we > can address fundamental problems within the core codebase, namely speed.
> > I am thinking at the moment of pushing a 'v1' branch next week after I > make an official announcement, with a new 1.6 release coming out from that > branch (as already announced, tentatively scheduled for March 1). That > frees up master for any code development, removal of modules/cruft, etc. > This will open an initial path forward and at least enable us to do more. > Make sense? This of course means that any code reliant on v1 should pull > from that branch instead of 'master'. > > Thoughts? > > chris > > -- Sheldon McKay, PhD Computational Biologist DNA Learning Center Cold Spring Harbor Laboratory 1 Bungtown Rd Cold Spring Harbor, NY 11724 (516) 367-5185 www.dnalc.org From cjfields at illinois.edu Sat Feb 9 23:25:14 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 10 Feb 2013 04:25:14 +0000 Subject: [Bioperl-l] BioPerl future In-Reply-To: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> Apologies if you receive this twice. I never received the replies from the gbrowse list through bioperl-l so it is possible there were mail issues last night. ------------------------ All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.
By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? chris From genehack at genehack.org Sat Feb 9 23:36:07 2013 From: genehack at genehack.org (John SJ Anderson) Date: Sat, 9 Feb 2013 20:36:07 -0800 Subject: [Bioperl-l] BioPerl future In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6@genehack.org> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: > Thoughts? +1 The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. j. 
-- John SJ Anderson // genehack at genehack.org From carandraug+dev at gmail.com Sun Feb 10 13:40:33 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Sun, 10 Feb 2013 18:40:33 +0000 Subject: [Bioperl-l] BioPerl future Message-ID: On 10 February 2013 17:00, wrote: > Message: 3 > Date: Sat, 9 Feb 2013 20:36:07 -0800 > From: John SJ Anderson > Subject: Re: [Bioperl-l] BioPerl future > To: "Fields, Christopher J" > Cc: BioPerl List > Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org> > Content-Type: text/plain; charset=us-ascii > > On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: > >> Thoughts? > > +1 > > The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. For those interested, I have just added instructions on the wiki on how to split a subset of modules, tests, files, etc from the bioperl-live repository into a new repository while keeping their old history. http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live Carnë From cjfields at illinois.edu Sun Feb 10 15:08:35 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 10 Feb 2013 20:08:35 +0000 Subject: [Bioperl-l] BioPerl future In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE20632@CHIMBX5.ad.uillinois.edu> On Feb 10, 2013, at 12:40 PM, Carnë Draug wrote: > On 10 February 2013 17:00, wrote: >> Message: 3 >> Date: Sat, 9 Feb 2013 20:36:07 -0800 >> From: John SJ Anderson >> Subject: Re: [Bioperl-l] BioPerl future >> To: "Fields, Christopher J" >> Cc: BioPerl List >> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org> >> Content-Type: text/plain; charset=us-ascii >> >> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: >> >>> Thoughts?
>> >> +1 >> >> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. > > For those interested, I have just added instructions on the wiki on > how to split a subset of modules, tests, files, etc from the > bioperl-live repository into a new repository while keeping their old > history. > > http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live > > Carnë It's probably worth looking at this page as well, then: http://www.bioperl.org/wiki/BioPerl_Modularization We should probably merge the two. chris From hlapp at drycafe.net Sun Feb 10 20:03:34 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Sun, 10 Feb 2013 20:03:34 -0500 Subject: [Bioperl-l] PhyloXML In-Reply-To: <51152591.9010402@unil.ch> References: <51152591.9010402@unil.ch> Message-ID: On Feb 8, 2013, at 11:19 AM, Moretti Sébastien wrote: > # Add annotation > $treeio->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From sebastien.moretti at unil.ch Mon Feb 11 02:08:22 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=) Date: Mon, 11 Feb 2013 08:08:22 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: References: <51152591.9010402@unil.ch> Message-ID: <511898E6.7060400@unil.ch> >> # Add annotation >> $treeio->add_phyloXML_annotation(-obj => $tree, >> -xml => 'SUMF family', >> ); > > If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? > > -hilmar I replaced $treeio by $tree in the above line but still get an error. Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" The only thing I changed is the length of the xml string I try to insert. But I get the same error with an empty xml string. my $treeio = new Bio::TreeIO(-file => "$infile", -format => 'phyloxml', ); my $tree = $treeio->next_tree; # Add annotation $tree->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor does any of its base classes. See perlobj. Uncaught exception from user code: Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1.
at ./add_annotation_to_phyloxml.pl line 40 -- Sébastien Moretti Department of Ecology and Evolution, Biophore, University of Lausanne, CH-1015 Lausanne, Switzerland Tel.: +41 (21) 692 4221/4079 http://bioinfo.unil.ch/ From saladi1 at illinois.edu Tue Feb 12 16:24:34 2013 From: saladi1 at illinois.edu (Shyam Saladi) Date: Tue, 12 Feb 2013 13:24:34 -0800 Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons Message-ID: Hi, I am using the count_codons method from Bio::Tools::SeqStats and keep getting "AMBIGUOUS" codons, but I can't figure out why exactly. When I translate the same sequence that gives the error using another standard utility like (ExPASy - Translate), it seems to work alright. An example sequence is below. Could anyone lend some insight? Thanks, Shyam AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC AGT *AMBIGUOUS* ATA ATC ATG ATT CAA CAC CAG CAT CCA CCC CCG CCT CGA CGC CGG CGT CTA CTC CTG CTT GAA GAC GAG GAT GCA GCC GCG GCT GGA GGC GGG GGT GTA GTC GTG GTT TAA TAC TAT TCA TCC TCG TCT TGG TGT TTA TTC TTG TTT count filename 1.722488038277511961722488038277511961722 2.966507177033492822966507177033492822967 1.531100478468899521531100478468899521531 0.9569377990430622009569377990430622009569 0.4784688995215311004784688995215311004785 1.722488038277511961722488038277511961722 1.33971291866028708133971291866028708134 1.913875598086124401913875598086124401914 0.1913875598086124401913875598086124401914 0.7655502392344497607655502392344497607656 1.435406698564593301435406698564593301435 * 0.09569377990430622009569377990430622009569* 0.3827751196172248803827751196172248803828 2.488038277511961722488038277511961722488 3.349282296650717703349282296650717703349 3.636363636363636363636363636363636363636 2.870813397129186602870813397129186602871 0.3827751196172248803827751196172248803828 1.626794258373205741626794258373205741627 0.4784688995215311004784688995215311004785 1.722488038277511961722488038277511961722 0.5741626794258373205741626794258373205742
1.052631578947368421052631578947368421053 1.244019138755980861244019138755980861244 0.3827751196172248803827751196172248803828 0.7655502392344497607655502392344497607656 0.1913875598086124401913875598086124401914 2.488038277511961722488038277511961722488 0.4784688995215311004784688995215311004785 0.6698564593301435406698564593301435406699 2.105263157894736842105263157894736842105 0.8612440191387559808612440191387559808612 2.870813397129186602870813397129186602871 1.435406698564593301435406698564593301435 1.722488038277511961722488038277511961722 2.775119617224880382775119617224880382775 2.00956937799043062200956937799043062201 2.488038277511961722488038277511961722488 3.540669856459330143540669856459330143541 2.00956937799043062200956937799043062201 0.1913875598086124401913875598086124401914 2.392344497607655502392344497607655502392 0.8612440191387559808612440191387559808612 5.454545454545454545454545454545454545455 1.913875598086124401913875598086124401914 0.8612440191387559808612440191387559808612 4.593301435406698564593301435406698564593 2.679425837320574162679425837320574162679 0.09569377990430622009569377990430622009569 1.148325358851674641148325358851674641148 1.148325358851674641148325358851674641148 0.8612440191387559808612440191387559808612 0.4784688995215311004784688995215311004785 2.105263157894736842105263157894736842105 0.9569377990430622009569377990430622009569 0.9569377990430622009569377990430622009569 0.09569377990430622009569377990430622009569 2.679425837320574162679425837320574162679 2.966507177033492822966507177033492822967 3.062200956937799043062200956937799043062 2.775119617224880382775119617224880382775 1045 temp.seq 
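A quick way to check a report like the one above is to scan the input for any character outside A, C, G, T and U. The sketch below is plain Perl, not BioPerl code: find_ambiguous is a hypothetical helper written for illustration, and it assumes the reading frame starts at the first base. The example sequence in this message contains a single Y (the IUPAC code for C or T) in the stretch TCTGGCTTYTTGATGG, and one affected codon out of 1045 is about 0.0957 percent, which matches the value in the AMBIGUOUS column.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): report every character that is
# not A, C, G, T or U, together with its 1-based position and the codon that
# contains it, assuming the reading frame starts at the first base.
sub find_ambiguous {
    my ($seq) = @_;
    my @hits;
    while ($seq =~ /([^ACGTU])/gi) {
        my $pos         = pos($seq);                # 1-based position of the matched character
        my $codon_start = $pos - (($pos - 1) % 3);  # 1-based start of the enclosing codon
        push @hits, {
            char  => $1,
            pos   => $pos,
            codon => substr($seq, $codon_start - 1, 3),
        };
    }
    return @hits;
}

# Toy input, made up for this sketch: codons ATG GCT TYT TGA.
for my $hit (find_ambiguous('ATGGCTTYTTGA')) {
    printf "%s at position %d (codon %s)\n", @{$hit}{qw(char pos codon)};
}
# prints: Y at position 8 (codon TYT)
```

To run this against a real file, the sequence would first have to be read into a single string with line breaks removed; how that is done (plain file reading or Bio::SeqIO) is left open here.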
ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTACGCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTCGTATTTGCCTTCGTACCACCTGC
GGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAGATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTAGGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA From bosborne11 at verizon.net Tue Feb 12 21:30:08 2013 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 12 Feb 2013 21:30:08 -0500 Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons In-Reply-To: References: Message-ID: Shyam, An ambiguous codon would be one that has a character other than [ACTGU] in it. I see '!' in your sequences, that would create an ambiguous codon. Brian O. On Feb 12, 2013, at 4:24 PM, Shyam Saladi wrote: > Hi, > > I am using the count_codons method from Bio::Tools::SeqStats and keep > getting "AMBIGUOUS" codons, but I can't figure out why exactly. > > When I translate the same sequence that gives the error using another > standard utility like (ExPASy - Translate), it seems to work alright. > > An example sequence is below. Could anyone lend some insight? 
> > Thanks, > Shyam > > > > AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC > AGT *AMBIGUOUS* ATA ATC ATG ATT CAA CAC > CAG CAT CCA CCC CCG CCT CGA CGC CGG > CGT CTA CTC CTG CTT GAA GAC GAG GAT GCA > GCC GCG GCT GGA GGC GGG GGT GTA GTC > GTG GTT TAA TAC TAT TCA TCC TCG TCT TGG > TGT TTA TTC TTG TTT count filename > 1.722488038277511961722488038277511961722 > 2.966507177033492822966507177033492822967 > 1.531100478468899521531100478468899521531 > 0.9569377990430622009569377990430622009569 > 0.4784688995215311004784688995215311004785 > 1.722488038277511961722488038277511961722 > 1.33971291866028708133971291866028708134 > 1.913875598086124401913875598086124401914 > 0.1913875598086124401913875598086124401914 > 0.7655502392344497607655502392344497607656 > 1.435406698564593301435406698564593301435 * > 0.09569377990430622009569377990430622009569* > 0.3827751196172248803827751196172248803828 > 2.488038277511961722488038277511961722488 > 3.349282296650717703349282296650717703349 > 3.636363636363636363636363636363636363636 > 2.870813397129186602870813397129186602871 > 0.3827751196172248803827751196172248803828 > 1.626794258373205741626794258373205741627 > 0.4784688995215311004784688995215311004785 > 1.722488038277511961722488038277511961722 > 0.5741626794258373205741626794258373205742 > 1.052631578947368421052631578947368421053 > 1.244019138755980861244019138755980861244 > 0.3827751196172248803827751196172248803828 > 0.7655502392344497607655502392344497607656 > 0.1913875598086124401913875598086124401914 > 2.488038277511961722488038277511961722488 > 0.4784688995215311004784688995215311004785 > 0.6698564593301435406698564593301435406699 > 2.105263157894736842105263157894736842105 > 0.8612440191387559808612440191387559808612 > 2.870813397129186602870813397129186602871 > 1.435406698564593301435406698564593301435 > 1.722488038277511961722488038277511961722 > 2.775119617224880382775119617224880382775 > 2.00956937799043062200956937799043062201 > 2.488038277511961722488038277511961722488 > 
3.540669856459330143540669856459330143541 > 2.00956937799043062200956937799043062201 > 0.1913875598086124401913875598086124401914 > 2.392344497607655502392344497607655502392 > 0.8612440191387559808612440191387559808612 > 5.454545454545454545454545454545454545455 > 1.913875598086124401913875598086124401914 > 0.8612440191387559808612440191387559808612 > 4.593301435406698564593301435406698564593 > 2.679425837320574162679425837320574162679 > 0.09569377990430622009569377990430622009569 > 1.148325358851674641148325358851674641148 > 1.148325358851674641148325358851674641148 > 0.8612440191387559808612440191387559808612 > 0.4784688995215311004784688995215311004785 > 2.105263157894736842105263157894736842105 > 0.9569377990430622009569377990430622009569 > 0.9569377990430622009569377990430622009569 > 0.09569377990430622009569377990430622009569 > 2.679425837320574162679425837320574162679 > 2.966507177033492822966507177033492822967 > 3.062200956937799043062200956937799043062 > 2.775119617224880382775119617224880382775 1045 temp.seq > > 
ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTAC! > GCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTC! 
> GTATTTGCCTTCGTACCACCTGCGGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAG > ATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTA! > GGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 13 10:18:10 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 15:18:10 +0000 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> All, tl;dr: A lot of change is coming. Be forewarned and be prepared. This is an 'official' announcement to the BioPerl community on future BioPerl plans. We have decided to move continued maintenance of Bioperl release series over to the new 'v1' branch. 
This branch will be the point where any future versions of 1.6.x code will be released, starting with the (already-scheduled) March 1 release. The 'master' branch will become the main focal point for future development of BioPerl going into an eventual v2 release, with a focus on performance enhancements, addressing newer technologies like NGS and large data, code cleanup, and simplifying the code base. We welcome any help with code improvements. GMOD folks? Want to help? This is a good opportunity to address BioPerl short-comings in the code base! What this means for anyone using BioPerl currently: 1) We anticipate significant issues if you are relying on the 'master' branch for anything. To inelegantly state it, the core developers are taking back the 'master' branch for future development. Please please please do not rely on the 'master' branch for stable code; if you are reliant on the BioPerl 1.6.x, make sure to use 'v1'. We can revisit whether to make 'v1' the default checkout branch if/when the need arises. 2) Expect not to find some modules. We will be migrating modules requiring external dependencies and other associated chunks of the code base out into their own repositories over the next year to help future maintenance; the eventual intent is to release all of these independently on CPAN. We will completely remove all code previously marked as deprecated, and we may immediately deprecate additional modules if needed (this will of course be discussed on list). 3) Expect version numbering to change significantly. Because we are releasing code in separate repositories, I fully expect downstream versioning problems if we stick with the current system (e.g. all bioperl-live modules having the same version). It will be too much of a headache to sync versions for all modules as this will entail making a full release of all bioperl code, one of the main reasons we are splitting out code to begin with. 
At the moment, no specific versioning scheme has been chosen, though I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions). This is the standard that Lincoln has adopted for Bio::Graphics and GBrowse. 4) Expect quick deprecation of methods within modules as needed. These should of course be brought up to the mail list prior to actual implementation, but I would anticipate some things changing as we try to adopt a more consistent method naming scheme. 5) The same steps outlined for bioperl-live will apply for bioperl-run modules. We will have to decide the best approach to use for those, e.g. whether to separate them out based on task (alignment), application group (NGS, BLAST, RNA), etc. and how these may fit organically with bioperl-live modules where appropriate. 6) Do not expect a new CPAN release of such code until Dec 2013. Even then it will be in an alpha stage. We are all busy campers. We do not anticipate significant changes to bioperl-network or bioperl-db at this time beyond updating them to deal with new changes. I'm sure there are many other points that need to be discussed. Please reply over the next week if you have any concerns. chris From cjfields at illinois.edu Wed Feb 13 11:01:07 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 16:01:07 +0000 Subject: [Bioperl-l] Test-pls ignore Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2506D@CHIMBX5.ad.uillinois.edu> testing the mail list to see if it is working. 
-c From sebastien.moretti at unil.ch Wed Feb 13 11:21:23 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=) Date: Wed, 13 Feb 2013 17:21:23 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> Message-ID: <511BBD83.2000708@unil.ch> >>>> # Add annotation >>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>> -xml => 'SUMF family', >>>> ); >>> >>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >>> >>> -hilmar >> >> I replaced $treeio by $tree in the above line but still get an error. >> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >> >> The only thing I changed is the length of the xml string I try to insert. But I get the same error with an empty xml string. >> >> >> >> my $treeio = new Bio::TreeIO(-file => "$infile", >> -format => 'phyloxml', >> ); >> my $tree = $treeio->next_tree; >> >> # Add annotation >> $tree->add_phyloXML_annotation(-obj => $tree, >> -xml => 'SUMF family', >> ); >> >> Can't locate object method "add_phyloXML_annotation" via package >> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >> (F) You called a method correctly, and it correctly indicated a package >> functioning as a class, but that package doesn't define that particular >> method, nor does any of its base classes. See perlobj. >> >> Uncaught exception from user code: >> Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. >> at ./add_annotation_to_phyloxml.pl line 40 > > Will have to look into this.
One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. > > chris Do you mean that BioPerl 1.6.901 does not fully support PhyloXML? Is the problem I have "expected"? -- Sébastien Moretti From cjfields at illinois.edu Wed Feb 13 10:47:17 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 15:47:17 +0000 Subject: [Bioperl-l] PhyloXML In-Reply-To: <511898E6.7060400@unil.ch> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> On Feb 11, 2013, at 1:08 AM, Sébastien MORETTI wrote: >>> # Add annotation >>> $treeio->add_phyloXML_annotation(-obj => $tree, >>> -xml => 'SUMF family', >>> ); >> >> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >> >> -hilmar > > I replaced $treeio by $tree in the above line but still get an error. > Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" > > The only thing I changed is the length of the xml string I try to insert. But I get the same error with an empty xml string. > > > > my $treeio = new Bio::TreeIO(-file => "$infile", > -format => 'phyloxml', > ); > my $tree = $treeio->next_tree; > > # Add annotation > $tree->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); > > Can't locate object method "add_phyloXML_annotation" via package > "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) > (F) You called a method correctly, and it correctly indicated a package > functioning as a class, but that package doesn't define that particular > method, nor does any of its base classes. See perlobj.
> > Uncaught exception from user code: > Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. > at ./add_annotation_to_phyloxml.pl line 40 > > > > -- > Sébastien Moretti > Department of Ecology and Evolution, > Biophore, University of Lausanne, > CH-1015 Lausanne, Switzerland > Tel.: +41 (21) 692 4221/4079 > http://bioinfo.unil.ch/ Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. chris From carandraug+dev at gmail.com Wed Feb 13 12:23:23 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Wed, 13 Feb 2013 17:23:23 +0000 Subject: [Bioperl-l] Next BioPerl release Message-ID: On 5 February 2013 21:53, Fields, Christopher J wrote: > I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! Hi, is this a release of bioperl-live only, or does it also include bioperl-run? Carnë From cjfields at illinois.edu Wed Feb 13 12:08:21 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 17:08:21 +0000 Subject: [Bioperl-l] PhyloXML In-Reply-To: <511BBD83.2000708@unil.ch> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> <511BBD83.2000708@unil.ch> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 10:21 AM, Moretti Sébastien wrote: >>>>> # Add annotation >>>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>>> -xml => 'SUMF family', >>>>> ); >>>> >>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
>>>> >>>> -hilmar >>> >>> I replaced $treeio by $tree in the above line but still get an error. >>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >>> >>> The only thing I changed is the length of the xml string I try to insert. But I get the same error with an empty xml string. >>> >>> >>> >>> my $treeio = new Bio::TreeIO(-file => "$infile", >>> -format => 'phyloxml', >>> ); >>> my $tree = $treeio->next_tree; >>> >>> # Add annotation >>> $tree->add_phyloXML_annotation(-obj => $tree, >>> -xml => 'SUMF family', >>> ); >>> >>> Can't locate object method "add_phyloXML_annotation" via package >>> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >>> (F) You called a method correctly, and it correctly indicated a package >>> functioning as a class, but that package doesn't define that particular >>> method, nor does any of its base classes. See perlobj. >>> >>> Uncaught exception from user code: >>> >>> at ./add_annotation_to_phyloxml.pl line 40 >> >> Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. >> >> chris > > Do you mean that BioPerl 1.6.901 does not fully support PhyloXML? > Is the problem I have "expected"? > > -- > Sébastien Moretti I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky. I tried cleaning this up a few years back but didn't make much progress. The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it): $treeio->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed; I will have to trace that back. Can you file a bug on this?
https://redmine.open-bio.org/ chris From cjfields at illinois.edu Wed Feb 13 13:05:53 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 18:05:53 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 11:23 AM, Carn? Draug wrote: > On 5 February 2013 21:53, Fields, Christopher J wrote: >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! > > Hi > > is this release of bioperl-live only or also includes bioperl-run? > > Carn? We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance. chris From l.m.timmermans at students.uu.nl Wed Feb 13 14:44:22 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 13 Feb 2013 20:44:22 +0100 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J wrote: > We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance. Also, bioperl-run needs to be cut into smaller distributions even more than bioperl-live. 
Few people if anyone at all has all tools it tries to wrap at hand, so its almost impossible to pass its testing suite. We need dists that can realistically pass. Leon From cjfields at illinois.edu Wed Feb 13 16:04:26 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 21:04:26 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25B07@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 1:44 PM, Leon Timmermans wrote: > On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J > wrote: >> We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance. > > Also, bioperl-run needs to be cut into smaller distributions even more > than bioperl-live. Few people if anyone at all has all tools it tries > to wrap at hand, so its almost impossible to pass its testing suite. > > We need dists that can realistically pass. > > Leon Yup. It's a mess. chris From florent.angly at gmail.com Wed Feb 13 17:33:14 2013 From: florent.angly at gmail.com (Florent Angly) Date: Thu, 14 Feb 2013 08:33:14 +1000 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> Message-ID: <511C14AA.9030107@gmail.com> On 14/02/13 01:18, Fields, Christopher J wrote: > I*highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions) Yes, I support the X.Y versioning as well. 
Florent From l.m.timmermans at students.uu.nl Wed Feb 13 18:12:06 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Thu, 14 Feb 2013 00:12:06 +0100 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development In-Reply-To: <511C14AA.9030107@gmail.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> <511C14AA.9030107@gmail.com> Message-ID: On Wed, Feb 13, 2013 at 11:33 PM, Florent Angly wrote: > On 14/02/13 01:18, Fields, Christopher J wrote: >> >> I*highly* recommend using X.Y versioning for simplicity (e.g. no more >> 3-point versions) > > Yes, I support the X.Y versioning as well. > Florent See also: http://www.dagolden.com/index.php/369/version-numbers-should-be-boring/ Leon From daisieh at gmail.com Thu Feb 14 00:21:15 2013 From: daisieh at gmail.com (Daisie Huang) Date: Wed, 13 Feb 2013 21:21:15 -0800 (PST) Subject: [Bioperl-l] Question regarding while loops for reading files In-Reply-To: References: Message-ID: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com> I think you need to reset the pointer to the filehandle before you go through the while loop the second time: seek $fh,0,0 On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote: > > Hey Guys, > > I am still at the same place. I am writing these little pieces of code to > try to learn the language better, so any advice would be useful. I am again > parsing through tab delimited files and now trying to find fish from on id > (in these case families AS5 and AS9), retrieve the weights and average > them. When I started I did it for one family and it worked (instead of the > @families I had a scalar $family set to AS5). But really it is more useful > to look at more than one family at time (I should mention that are 2 types > of fish per family one ends in PS , the other doesn't). So I tried to use a > foreach loop to go through the file twice, once with a the search value set > to AS5 and a second time to AS9. 
It works for AS5, but for some reason, the > foreach loop sets $test to AS9 the second time, but it doesn't go through > the while loop. What am I doing wrong? > > here is the code: > > #! /usr/bin/perl > use strict; > use warnings; > > my $file = $ARGV[0]; > my @family = ('AS5','AS9'); > my $i; > my $ii; > my $test; > > open (my $fh, "<", $file) or die ("Can't open $file: $!"); > > foreach (@family){ > $test = $_; > my @data_weight_2N = (); > my @data_weight_3N = (); > while (<$fh>){ > chomp; > my $line = $_; > my @data = split ("\t", $line); > if ($data[0] !~ /[0-9]*/){ > next;} > elsif ($data[1] eq "ABF09-$test"){ > $i += 1; > push (@data_weight_2N, $data[6]); > }elsif ($data[1] eq "ABF09-".$test."PS"){ > $ii += 1; > push (@data_weight_3N,$data[6]); > } > } > my $mean_2N = &average (\@data_weight_2N); > my $stdev_2N = &stdev (\@data_weight_2N); > my $stderr_2N = ($stdev_2N/sqrt($i)); > > print "These are the the avearge weight, stdev and stderr for $test > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n"; > > my $mean_3N = &average (\@data_weight_3N); > my $stdev_3N = &stdev (\@data_weight_3N); > my $stderr_3N = ($stdev_3N/sqrt($i)); > > print "These are the the avearge weight, stdev and stderr for $test > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n"; > } > > close ($fh); > > sub average{ > my($data) = @_; > if (not @$data) { > print ("Empty array\n"); > return 0; > } > my $total = 0; > foreach (@$data) { > $total += $_; > } > my $average = $total / @$data; > return $average; > } > > sub stdev{ > my($data) = @_; > if(@$data == 1){ > return 0; > } > my $average = &average($data); > my $sqtotal = 0; > foreach(@$data) { > $sqtotal += ($average-$_) ** 2; > } > my $std = ($sqtotal / (@$data-1)) ** 0.5; > return $std; > } > > Thanks, > > T. > > -- > "Education is not to be used to promote obscurantism." - Theodonius > Dobzhansky. 
> > "Gracias a la vida que me ha dado tanto > Me ha dado el sonido y el abecedario > Con ?l, las palabras que pienso y declaro > Madre, amigo, hermano > Y luz alumbrando la ruta del alma del que estoy amando > > Gracias a la vida que me ha dado tanto > Me ha dado la marcha de mis pies cansados > Con ellos anduve ciudades y charcos > Playas y desiertos, monta?as y llanos > Y la casa tuya, tu calle y tu patio" > > Violeta Parra - Gracias a la Vida > > Tiago S. F. Hori. PhD. > Ocean Science Center-Memorial University of Newfoundland > From sebastien.moretti at unil.ch Thu Feb 14 03:09:06 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=) Date: Thu, 14 Feb 2013 09:09:06 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> <511BBD83.2000708@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> Message-ID: <511C9BA2.9000508@unil.ch> >>>>>> # Add annotation >>>>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>>>> -xml => 'SUMF family', >>>>>> ); >>>>> >>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >>>>> >>>>> -hilmar >>>> >>>> I replaced $treeio by $tree in the above line but still get an error. >>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >>>> >>>> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. 
>>>> >>>> >>>> >>>> my $treeio = new Bio::TreeIO(-file => "$infile", >>>> -format => 'phyloxml', >>>> ); >>>> my $tree = $treeio->next_tree; >>>> >>>> # Add annotation >>>> $tree->add_phyloXML_annotation(-obj => $tree, >>>> -xml => 'SUMF family', >>>> ); >>>> >>>> Can't locate object method "add_phyloXML_annotation" via package >>>> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >>>> (F) You called a method correctly, and it correctly indicated a package >>>> functioning as a class, but that package doesn't define that particular >>>> method, nor does any of its base classes. See perlobj. >>>> >>>> Uncaught exception from user code: >>>> >>>> at ./add_annotation_to_phyloxml.pl line 40 >>> >>> Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. >>> >>> chris >> >> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? >> The problem I have is "expected" ? >> >> -- >> S?bastien Moretti > > I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky. I tried cleaning this up a few years back but didn't make much progress. > > The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it): > > $treeio->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); > > My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back. Can you file a bug on this? > > https://redmine.open-bio.org/ > > chris I will fill a bug on this. I'd be happy to try to contribute to the phyloxml code. But don't know how to proceed for BioPerl. 
--
Sébastien Moretti

From hartzell at alerce.com Thu Feb 14 15:04:44 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 14 Feb 2013 12:04:44 -0800
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
References: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
Message-ID: <20765.17244.185833.755900@gargle.gargle.HOWL>

I think that it's important to get feedback on code that one has written and to try to understand what someone else has done in their code, and how and why. To that end....

Since Tiago's using this to learn the language better I can't resist some comments beyond resetting the file handle.

For grins I rewrote it to use Text::CSV_XS and Statistics::Basic and to take a single pass through the data file using a multilevel data structure. I resisted the urge to rewrite it in Moose. Didn't even have an urge to rewrite it in R. Funny, that....

The script is here

  Tiago.pl https://gist.github.com/hartzell/4955401

With something like what I think the data looks like here:

  https://gist.github.com/hartzell/4955570

Even without that big of a rewrite, I had a bunch of local comments which are inline below.

Daisie Huang writes:
> [...]
> On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
> >
> > Hey Guys,
> >
> > I am still at the same place. I am writing these little pieces of code to
> > try to learn the language better, so any advice would be useful.
> > [...]
> > here is the code:
> >
> > #! /usr/bin/perl
> > use strict;
> > use warnings;
> >
> > my $file = $ARGV[0];

Slightly better would be $filename, so that when you step up to Path::Class you can differentiate a file object from a file name string.

> > my @family = ('AS5','AS9');

Better would be @families, plural. See the use of $family below.

> > my $i;
> > my $ii;

As far as I can tell, these are just counting the number of things that you push onto the various arrays.
You don't need them; referring to the list in scalar context will give you its size.

> > my $test;

You use this to hold the name of the family, so it's not particularly evocative. You should also restrict its scope to within the loop. See the comment for the foreach loop.

> > open (my $fh, "<", $file) or die ("Can't open $file: $!");

You made my day, three arg. open *and* you checked for errors. Nice!

> > foreach (@family){

Better would be

    for my $family (@families) {

which is evocative and restricts the scope of $family to the for loop (and for is 4 characters shorter than foreach...).

> >     $test = $_;

No longer need this, using $family declared in the for loop with the proper scoping.

> >     my @data_weight_2N = ();
> >     my @data_weight_3N = ();
> >     while (<$fh>){
> >         chomp;
> >         my $line = $_;
> >         my @data = split ("\t", $line);

Don't parse CSV (TSV) files yourself. Get in the habit of using Text::CSV_XS.

> >         if ($data[0] !~ /[0-9]*/){
> >             next;}
> >         elsif ($data[1] eq "ABF09-$test"){
> >             $i += 1;

You don't need the counter.

> >             push (@data_weight_2N, $data[6]);
> >         }elsif ($data[1] eq "ABF09-".$test."PS"){
> >             $ii += 1;

You don't need the counter.

> >             push (@data_weight_3N,$data[6]);
> >         }
> >     }
> >     my $mean_2N = &average (\@data_weight_2N);
> >     my $stdev_2N = &stdev (\@data_weight_2N);

You don't need the ampersands on the subroutine calls. They're old school and just encourage people to make fun of our language for its use of all those funny punctuation marks.

> >     my $stderr_2N = ($stdev_2N/sqrt($i));

Unless I'm mistaken, this is equivalent to

    my $stderr_2N = ($stdev_2N/sqrt(scalar @data_weight_2N));

and you don't need the counter; the explicit use of scalar there might even be redundant (I'm a coward). You use the same trick in your subroutine defn's below.
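The point about dropping the counters can be sketched as follows. This is only an illustration under stated assumptions: the helper name std_error, the sample weights, and the standard deviation value are made up for the example and are not taken from the original script.

```perl
use strict;
use warnings;

# Standard error of the mean: stdev / sqrt(n). The element count n is read
# from the array itself instead of from a hand-maintained counter.
# (std_error is a hypothetical helper name used only in this sketch.)
sub std_error {
    my ($stdev, $data) = @_;
    my $n = scalar @$data;      # array in scalar context = number of elements
    return unless $n;           # nothing to divide by for an empty list
    return $stdev / sqrt($n);
}

# Hypothetical weights and a made-up standard deviation, for illustration only.
my @data_weight_2N = (10.2, 11.8, 9.5, 12.1);
my $stdev_2N       = 1.2;

printf "n = %d, stderr = %.3f\n",
    scalar @data_weight_2N, std_error($stdev_2N, \@data_weight_2N);
```

Because the count comes from each array, the 2N and 3N groups cannot accidentally share a counter, and nothing carries over from one family to the next.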
> >
> >     print "These are the the avearge weight, stdev and stderr for $test
> > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
> >
> >     my $mean_3N = &average (\@data_weight_3N);
> >     my $stdev_3N = &stdev (\@data_weight_3N);
> >     my $stderr_3N = ($stdev_3N/sqrt($i));
> >
> >     print "These are the the avearge weight, stdev and stderr for $test
> > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
> > }
> >
> > close ($fh);

Ah, rats. You checked whether open worked; you need to do the same thing on close too!

    close ($fh) or die $!;

Or you could just

    use autodie qw(open close);

and then they'll die appropriately when they have to and you don't have to bother with the checking.

> > sub average{
> >     my($data) = @_;
> >     if (not @$data) {
> >         print ("Empty array\n");
> >         return 0;
> >     }
> >     my $total = 0;
> >     foreach (@$data) {
> >         $total += $_;
> >     }

    use List::AllUtils qw(sum); # somewhere up at the top of the script...

    my $total = sum(@$data);
    if (not defined $total) {
        print "Empty array\n";
        return;
    }

List::AllUtils is your friend. Learn to use it.

Your returning 0 for an empty list is probably the wrong thing; isn't it possible for the total to actually be 0? Just return instead. Don't return undef, just return (and let perl take context into account for you).

You probably don't actually want to spew "Empty array" out into your output stream; imagine writing a script that postprocesses your output and having to deal with it. If you really need to say it, send it to standard error with

    print STDERR "Empty array\n";

> >     my $average = $total / @$data;
> >     return $average;

If you don't really need the error message, then you can get to

    my $total = sum(@$data);
    return unless defined $total;
    return $total / @$data;

And if an empty data array is *truly* unexpected, maybe you should just die/carp.
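Pulling those comments together, a minimal sketch of the averaging routine might look like this. It assumes the core List::Util module (List::AllUtils exports a sum with the same behaviour), and it tests the element count rather than the sum, since a total of 0 is a legitimate result.

```perl
use strict;
use warnings;
use List::Util qw(sum);   # core module, used here in place of List::AllUtils

# Mean of the numbers in an array reference.
# A bare "return" on an empty array gives undef in scalar context and an
# empty list in list context, so a genuine mean of 0 stays distinguishable
# from "no data".
sub average {
    my ($data) = @_;
    return unless @$data;
    return sum(@$data) / scalar @$data;
}

# A data set whose mean really is 0 must not be mistaken for an empty one.
my $mean = average([0, 0, 0]);
print defined $mean ? "mean: $mean\n" : "no data\n";
```

The caller decides what an empty group means: checking `defined` on the result keeps diagnostics out of the output stream, and a die or croak can be added at that point if empty input is truly unexpected.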
> > }
> >
> > sub stdev{
> >     my($data) = @_;
> >     if(@$data == 1){
> >         return 0;
> >     }
> >     my $average = &average($data);
> >     my $sqtotal = 0;
> >     foreach(@$data) {
> >         $sqtotal += ($average-$_) ** 2;
> >     }
> >     my $std = ($sqtotal / (@$data-1)) ** 0.5;
> >     return $std;
> > }

Ditto on the use of List::AllUtils, etc...

Phew.

The only other thing I'd like to see would be an arrangement that lets you write simple tests. A simple sol'n would be to package the entire main part of the code up into e.g. a subroutine that returns a hashref keyed by family, containing a hashref keyed by 2N/3N/... and then you could just:

    use Test::More;
    use Tiago qw(summarize);

    my $output = summarize("test_data.tsv");
    is($output->{AS5}->{'2N'}, "42", "Got the magic number");
    # etc...
    done_testing;

Thanks for sharing your code. Keep practicing!

g.

From carandraug+dev at gmail.com Thu Feb 14 17:13:45 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Thu, 14 Feb 2013 22:13:45 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
Message-ID:

Hi

we got word of it on another project I'm involved with and I was wondering. Is bioperl going to apply for the Google Summer of Code this year?

http://www.google-melange.com/gsoc/homepage/google/gsoc2013

Carnë

From hlapp at drycafe.net Fri Feb 15 09:28:30 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 15 Feb 2013 09:28:30 -0500
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To:
References:
Message-ID: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>

I presume the OBF does as an umbrella organization on behalf of all Bio* projects. If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or to look for co-mentors.

-hilmar

Sent with a tap.

On Feb 14, 2013, at 5:13 PM, Carnë Draug wrote:

> Hi
>
> we got word of it on another project I'm involved with and I was
> wondering. Is bioperl going to apply for the Google Summer of Code
> this year?
> > http://www.google-melange.com/gsoc/homepage/google/gsoc2013 > > Carn? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From p.j.a.cock at googlemail.com Fri Feb 15 09:47:39 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 15 Feb 2013 14:47:39 +0000 Subject: [Bioperl-l] bioperl in Google Summer of Code 2013 In-Reply-To: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> Message-ID: On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp wrote: > I presume the OBF does as an umbrella organization on behalf of all Bio* > projects. If you fancy proposing a project idea or mentoring, now is not a > bad time to think about that or looking for co-mentors. > > -hilmar Yes, the plan is that as in the last few years, the OBF will apply to GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At this stage the Bio* projects would be wise to start coming up with some good project ideas and experienced developers thinking about being a mentor. For potential students, getting involved in the community early is a good idea (e.g. bug reports, or better fixing existing bugs) See also: http://lists.open-bio.org/mailman/listinfo/gsoc http://lists.open-bio.org/mailman/listinfo/gsoc-mentors Peter From cjfields at illinois.edu Fri Feb 15 09:59:43 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 15 Feb 2013 14:59:43 +0000 Subject: [Bioperl-l] bioperl in Google Summer of Code 2013 In-Reply-To: References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu> On Feb 15, 2013, at 8:47 AM, Peter Cock wrote: > On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp wrote: >> I presume the OBF does as an umbrella organization on behalf of all Bio* >> projects. 
If you fancy proposing a project idea or mentoring, now is not a >> bad time to think about that or looking for co-mentors. >> >> -hilmar > > Yes, the plan is that as in the last few years, the OBF will apply to > GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At > this stage the Bio* projects would be wise to start coming up with > some good project ideas and experienced developers thinking about > being a mentor. For potential students, getting involved in the > community early is a good idea (e.g. bug reports, or better fixing > existing bugs) > > See also: > http://lists.open-bio.org/mailman/listinfo/gsoc > http://lists.open-bio.org/mailman/listinfo/gsoc-mentors > > Peter At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else. I can't take charge of writing up a proposal at the moment but I can certainly help edit. chris From scott at scottcain.net Fri Feb 15 14:18:37 2013 From: scott at scottcain.net (Scott Cain) Date: Fri, 15 Feb 2013 14:18:37 -0500 Subject: [Bioperl-l] sequence-region directives in gff files In-Reply-To: References: Message-ID: Hi Carn?, Thanks for pointing this out; I was only sort of paying attention to the FeatureIO discussion, and it hadn't occurred to me that my commit was the problem. I believe I've reproduced the functionality from that commit, and I even added a test that makes use of the added method (yes, I know, it surprised me too!). All of the tests now pass for me in the FeatureIO master. I'm putting it on my todo list to check that the Chado loader that makes use of Bio::FeatureIO still works as expected with the new incarnation. Thanks, Scott On Wed, Feb 13, 2013 at 5:22 AM, Carn? Draug wrote: > Hi Scott > > 3 years ago, the code for the Bio::SeqFeatureIO::* modules was split > from bioperl-live into a separate repository[1]. 
Because the code was > not removed from the bioperl-live repository, people ended up patching > on both sides, leading to 2 branches of development. Last weekend I > merged them back together with the exception of one commit that would > not longer apply[2]. > > This commit was authored by you with the following commit message: > "tiny change to Bio::FeatureIO::gff to allow the gmod chado gff3 bulk > loader to not choke when the gff file has ##sequence-region > directives. The loader is documented not to support this, but now it > will quitely ignore those directives." > > Do you think you could take a look at it? > > Thank you, > Carn? > > [1] https://github.com/bioperl/Bio-FeatureIO > [2] https://github.com/bioperl/bioperl-live/commit/7218728b66ad297953676236077fd0ec757378c0 -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From carandraug+dev at gmail.com Tue Feb 19 13:52:57 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 19 Feb 2013 18:52:57 +0000 Subject: [Bioperl-l] bioperl in Google Summer of Code 2013 In-Reply-To: References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu> Message-ID: On 15 February 2013 14:28, Hilmar Lapp wrote: > [...] > If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors. On 15 February 2013 14:59, Fields, Christopher J wrote: > At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else. I can't take charge of writing up a proposal at the moment but I can certainly help edit. I would like to participate this year as a student. I do not have however, have any bioperl itch that would last a summer to fix. The largest of them is to implement BLAST using NCBI's server. 
They have made available a SOAP-based BLAST and doing this has been on my todo for ages. Would you suggest any other project for bioperl?

Carnë

From peymanalavi at yahoo.com Tue Feb 19 16:16:49 2013
From: peymanalavi at yahoo.com (peyman alavi)
Date: Tue, 19 Feb 2013 13:16:49 -0800 (PST)
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
Message-ID: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>

Hello,

I have been having problems for a while trying to install the Bio::SCF module on my Vista32. Now, I know that Bio::SCF isn't really a Bioperl module, but I need it for Bio::Graphics, and I thought perhaps other people had experienced the same problem before.

I have installed zlib and io_lib (both their latest available versions), but it looks like something (presumably with io_lib) is missing. I should be very grateful if someone could tell me what still needs to be done!

Here are the paths where the io_lib "library" and "include" directories are installed, and I set them in cpan before trying to install Bio::SCF:

o conf makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"

And the following is what I get on STDOUT:

Set up gcc environment - 4.7.2

cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.

    makepl_arg         [LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include]
Please use 'o conf commit' to make the config permanent!

Reading 'D:\Perl\cpan\Metadata'
  Database was generated on Sun, 17 Feb 2013 12:17:02 GMT
Running install for module 'Bio::SCF'
Running make for L/LD/LDS/Bio-SCF-1.03.tar.gz
Checksum for D:\Perl\cpan\sources\authors\id\L\LD\LDS\Bio-SCF-1.03.tar.gz ok
Scanning cache D:\Perl/cpan/build for sizes
...........................................................................DONE
Bio-SCF-1.03/
Bio-SCF-1.03/t/
Bio-SCF-1.03/t/scf.t
Bio-SCF-1.03/eg/
Bio-SCF-1.03/eg/write_test_obj.pl
Bio-SCF-1.03/eg/write_test_tied.pl
Bio-SCF-1.03/eg/read_test_obj.pl
Bio-SCF-1.03/eg/read_test_tied.pl
Bio-SCF-1.03/SCF/
Bio-SCF-1.03/SCF/Arrays.pm
Bio-SCF-1.03/DISCLAIMER
Bio-SCF-1.03/README
Bio-SCF-1.03/SCF.pm
Bio-SCF-1.03/SCF.xs
Bio-SCF-1.03/Changes
Bio-SCF-1.03/test.scf
Bio-SCF-1.03/Makefile.PL
Bio-SCF-1.03/META.yml
Bio-SCF-1.03/INSTALL
Bio-SCF-1.03/MANIFEST

  CPAN.pm: Building L/LD/LDS/Bio-SCF-1.03.tar.gz

Set up gcc environment - 4.7.2
Checking if your kit is complete...
Looks good
Writing Makefile for Bio::SCF
Writing MYMETA.yml and MYMETA.json
cp SCF.pm blib\lib\Bio\SCF.pm
cp SCF/Arrays.pm blib\lib\Bio\SCF\Arrays.pm
D:\Perl\bin\perl.exe D:\Perl\site\lib\ExtUtils\xsubpp -typemap D:\Perl\lib\ExtUtils\typemap
SCF.xs > SCF.xsc && D:\Perl\bin\perl.exe -MExtUtils::Command -e mv -- SCF.xsc SCF.c Please specify prototyping behavior for SCF.xs (see perlxs manual) c:/MinGW/bin/gcc.exe -c? -Ic:/MinGW/msys/1.0/local/include ???????????? -DNDEBUG -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T -DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields -O2 ??????? ??-DVERSION=\"1.03\" ??????? -DXS_VERSION=\"1.03\"? "-ID:\Perl\lib\CORE"? -DLITTLE_ENDIAN SCF.c In file included from c:/MinGW/msys/1.0/local/include/io_lib/scf.h:31:0, ???????????????? from SCF.xs:12: c:/MinGW/msys/1.0/local/include/io_lib/mFILE.h:23:0: warning: "MF_APPEND" redefined [enabled by default] In file included from c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/windows.h:55:0, ???????????????? from D:\Perl\lib\CORE/win32.h:61, ???????????????? from D:\Perl\lib\CORE/win32thread.h:4, ???????????????? from D:\Perl\lib\CORE/perl.h:2825, ???????????????? from SCF.xs:5: c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/winuser.h:131:0: note: this is the location of the previous definition SCF.xs: In function 'XS_Bio__SCF_get_scf_pointer': SCF.xs:35:2: warning: passing argument 3 of '(*Perl_ILIO_ptr((struct PerlInterpreter *)Perl_get_context()))->pNameStat' from incompatible pointer type [enabled by default] SCF.xs:35:2: note: expected 'struct _stati64 *' but argument is of type 'struct stat *' Running Mkbootstrap for Bio::SCF () D:\Perl\bin\perl.exe -MExtUtils::Command -e chmod -- 644 SCF.bs D:\Perl\bin\perl.exe -MExtUtils::Mksymlists \ ???? -e "Mksymlists('NAME'=>\"Bio::SCF\", 'DLBASE' => 'SCF', 'DL_FUNCS' => {? }, 'FUNCLIST' => [], 'IMPORTS' => {? }, 'DL_VARS' => []);" Set up gcc environment - 4.7.2 dlltool --def SCF.def --output-exp dll.exp c:\MinGW\bin\g++.exe -o blib\arch\auto\Bio\SCF\SCF.dll -Wl,--base-file -Wl,dll.base -mdll -L"D:\Perl\lib\CORE" SCF.o?? 
D:\Perl\lib\CORE\perl512.lib c:\MinGW\lib\libkernel32.a c:\MinGW\lib\libuser32.a c:\MinGW\lib\libgdi32.a c:\MinGW\lib\libwinspool.a c:\MinGW\lib\libcomdlg32.a c:\MinGW\lib\libadvapi32.a c:\MinGW\lib\libshell32.a c:\MinGW\lib\libole32.a c:\MinGW\lib\liboleaut32.a c:\MinGW\lib\libnetapi32.a c:\MinGW\lib\libuuid.a c:\MinGW\lib\libws2_32.a c:\MinGW\lib\libmpr.a c:\MinGW\lib\libwinmm.a c:\MinGW\lib\libversion.a c:\MinGW\lib\libodbc32.a c:\MinGW\lib\libodbccp32.a c:\MinGW\lib\libcomctl32.a c:\MinGW\lib\libmsvcrt.a dll.exp Warning: resolving _VirtualQuery at 12 by linking to _VirtualQuery Use --enable-stdcall-fixup to disable these warnings Use --disable-stdcall-fixup to disable these fixups Warning: resolving _VirtualProtect at 16 by linking to _VirtualProtect Warning: resolving _EnterCriticalSection at 4 by linking to _EnterCriticalSection Warning: resolving _TlsGetValue at 4 by linking to _TlsGetValue Warning: resolving _GetLastError at 0 by linking to _GetLastError Warning: resolving _LeaveCriticalSection at 4 by linking to _LeaveCriticalSection Warning: resolving _DeleteCriticalSection at 4 by linking to _DeleteCriticalSection Warning: resolving _InitializeCriticalSection at 4 by linking to _InitializeCriticalSection SCF.o:SCF.c:(.text+0xf35): undefined reference to `mfreopen' SCF.o:SCF.c:(.text+0xf4b): undefined reference to `mfwrite_scf' SCF.o:SCF.c:(.text+0xf6a): undefined reference to `mfflush' SCF.o:SCF.c:(.text+0xf72): undefined reference to `mfdestroy' SCF.o:SCF.c:(.text+0x1138): undefined reference to `write_scf' SCF.o:SCF.c:(.text+0x16ac): undefined reference to `scf_deallocate' SCF.o:SCF.c:(.text+0x17b1): undefined reference to `mfreopen' SCF.o:SCF.c:(.text+0x17c1): undefined reference to `mfread_scf' SCF.o:SCF.c:(.text+0x19bd): undefined reference to `read_scf' c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: SCF.o: bad reloc address 0xa4 in section `.rdata' c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: final 
link failed: Invalid operation
collect2.exe: error: ld returned 1 exit status
dmake.exe:  Error code 129, while making 'blib\arch\auto\Bio\SCF\SCF.dll'
  LDS/Bio-SCF-1.03.tar.gz
  D:\Perl\site\bin\dmake.exe -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 LDS/Bio-SCF-1.03.tar.gz                      : make NO

Warning: Configuration not saved.
Lockfile removed.

Thanks in advance for any useful suggestions/help!!

Peyman

From scott at scottcain.net Tue Feb 19 18:39:44 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 19 Feb 2013 18:39:44 -0500
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
In-Reply-To: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
References: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
Message-ID: <777246AB-2EF0-403D-9652-8EA8390D5C53@scottcain.net>

Hi Peyman,

I have no idea what might be required to get Staden and Bio::SCF installed on a Windows machine; you have my sympathies for having to go through it. But what I wanted to touch on was what you wrote, that is, that you "need" it for Bio::Graphics. I just wanted to point out that you don't need it unless you want to be able to display traces from ABI sequencers (which most people don't really care to do these days). Bio::SCF is listed as a recommended module, not a required one.

Scott

Sent from my iPad

On Feb 19, 2013, at 4:16 PM, peyman alavi wrote:

> Hello,
> I am having
> problems for a while trying to install the Bio::SCF module on my Vista32. Now, I know that Bio::SCF isn't really a Bioperl module, but I need it for Bio::Graphics, and I thought perhaps other people had experienced the same problem before. I
> have installed zlib and io_lib (both their last available versions), but it
> looks like sth.
(presumably with io_lib) is missing. I should be very grateful > if someone could tell me what still needs to be done! > Here are > the paths where the io_lib "library" and "include" directories are installed, and I > set them to cpan before trying to install Bio::SCF: > o conf > makepl_arg ?LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include? > And the > following is what I get on the STDOUT: > > Set up gcc environment - 4.7.2 > [32m > cpan shell -- CPAN exploration and modules installation (v1.9800) > Enter 'h' for help.[0m > > [32m makepl_arg [LIBS=-Lc:/MinGW/msys/1.0/local/lib > INC=-Ic:/MinGW/msys/1.0/local/include][0m > [32mPlease use 'o conf commit' to make the config permanent![0m > > [32m[0m > [32mReading 'D:\Perl\cpan\Metadata'[0m > [32m Database was generated on > Sun, 17 Feb 2013 12:17:02 GMT[0m > [32mRunning install for module 'Bio::SCF'[0m > [32mRunning make for L/LD/LDS/Bio-SCF-1.03.tar.gz[0m > [32mChecksum for > D:\Perl\cpan\sources\authors\id\L\LD\LDS\Bio-SCF-1.03.tar.gz ok[0m > [32mScanning cache D:\Perl/cpan/build for sizes[0m > [32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32m.[0m[32mDONE[0m > [32mBio-SCF-1.03/[0m > [32mBio-SCF-1.03/t/[0m > [32mBio-SCF-1.03/t/scf.t[0m > [32mBio-SCF-1.03/eg/[0m > [32mBio-SCF-1.03/eg/write_test_obj.pl[0m > [32mBio-SCF-1.03/eg/write_test_tied.pl[0m > [32mBio-SCF-1.03/eg/read_test_obj.pl[0m > [32mBio-SCF-1.03/eg/read_test_tied.pl[0m > 
[32mBio-SCF-1.03/SCF/[0m > [32mBio-SCF-1.03/SCF/Arrays.pm[0m > [32mBio-SCF-1.03/DISCLAIMER[0m > [32mBio-SCF-1.03/README[0m > [32mBio-SCF-1.03/SCF.pm[0m > [32mBio-SCF-1.03/SCF.xs[0m > [32mBio-SCF-1.03/Changes[0m > [32mBio-SCF-1.03/test.scf[0m > [32mBio-SCF-1.03/Makefile.PL[0m > [32mBio-SCF-1.03/META.yml[0m > [32mBio-SCF-1.03/INSTALL[0m > [32mBio-SCF-1.03/MANIFEST[0m > [32m > CPAN.pm: Building > L/LD/LDS/Bio-SCF-1.03.tar.gz[0m > > Set up gcc environment - 4.7.2 > Checking if your kit is complete... > Looks good > Writing Makefile for Bio::SCF > Writing MYMETA.yml and MYMETA.json > cp SCF.pm blib\lib\Bio\SCF.pm > cp SCF/Arrays.pm blib\lib\Bio\SCF\Arrays.pm > D:\Perl\bin\perl.exe D:\Perl\site\lib\ExtUtils\xsubpp -typemap D:\Perl\lib\ExtUtils\typemap SCF.xs > SCF.xsc && > D:\Perl\bin\perl.exe -MExtUtils::Command -e mv -- SCF.xsc SCF.c > Please specify prototyping behavior for SCF.xs (see perlxs manual) > c:/MinGW/bin/gcc.exe -c -Ic:/MinGW/msys/1.0/local/include -DNDEBUG > -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE > -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T > -DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields -O2 -DVERSION=\"1.03\" -DXS_VERSION=\"1.03\" "-ID:\Perl\lib\CORE" -DLITTLE_ENDIAN SCF.c > In file included from c:/MinGW/msys/1.0/local/include/io_lib/scf.h:31:0, > from SCF.xs:12: > c:/MinGW/msys/1.0/local/include/io_lib/mFILE.h:23:0: warning: > "MF_APPEND" redefined [enabled by default] > In file included from > c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/windows.h:55:0, > from > D:\Perl\lib\CORE/win32.h:61, > from > D:\Perl\lib\CORE/win32thread.h:4, > from > D:\Perl\lib\CORE/perl.h:2825, > from SCF.xs:5: > c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/winuser.h:131:0: > note: this is the location of the previous definition > SCF.xs: In function 'XS_Bio__SCF_get_scf_pointer': > SCF.xs:35:2: warning: passing argument 3 of '(*Perl_ILIO_ptr((struct > 
PerlInterpreter *)Perl_get_context()))->pNameStat' from incompatible pointer > type [enabled by default] > SCF.xs:35:2: note: expected 'struct _stati64 *' but argument is of type > 'struct stat *' > Running Mkbootstrap for Bio::SCF () > D:\Perl\bin\perl.exe -MExtUtils::Command -e chmod -- 644 SCF.bs > D:\Perl\bin\perl.exe -MExtUtils::Mksymlists \ > -e > "Mksymlists('NAME'=>\"Bio::SCF\", 'DLBASE' => 'SCF', > 'DL_FUNCS' => { }, 'FUNCLIST' => > [], 'IMPORTS' => { }, 'DL_VARS' => > []);" > Set up gcc environment - 4.7.2 > dlltool --def SCF.def --output-exp dll.exp > c:\MinGW\bin\g++.exe -o blib\arch\auto\Bio\SCF\SCF.dll -Wl,--base-file > -Wl,dll.base -mdll -L"D:\Perl\lib\CORE" SCF.o D:\Perl\lib\CORE\perl512.lib > c:\MinGW\lib\libkernel32.a c:\MinGW\lib\libuser32.a c:\MinGW\lib\libgdi32.a > c:\MinGW\lib\libwinspool.a c:\MinGW\lib\libcomdlg32.a c:\MinGW\lib\libadvapi32.a > c:\MinGW\lib\libshell32.a c:\MinGW\lib\libole32.a c:\MinGW\lib\liboleaut32.a > c:\MinGW\lib\libnetapi32.a c:\MinGW\lib\libuuid.a c:\MinGW\lib\libws2_32.a > c:\MinGW\lib\libmpr.a c:\MinGW\lib\libwinmm.a c:\MinGW\lib\libversion.a > c:\MinGW\lib\libodbc32.a c:\MinGW\lib\libodbccp32.a c:\MinGW\lib\libcomctl32.a > c:\MinGW\lib\libmsvcrt.a dll.exp > Warning: resolving _VirtualQuery at 12 by linking to _VirtualQuery > Use --enable-stdcall-fixup to disable these warnings > Use --disable-stdcall-fixup to disable these fixups > Warning: resolving _VirtualProtect at 16 by linking to _VirtualProtect > Warning: resolving _EnterCriticalSection at 4 by linking to > _EnterCriticalSection > Warning: resolving _TlsGetValue at 4 by linking to _TlsGetValue > Warning: resolving _GetLastError at 0 by linking to _GetLastError > Warning: resolving _LeaveCriticalSection at 4 by linking to > _LeaveCriticalSection > Warning: resolving _DeleteCriticalSection at 4 by linking to > _DeleteCriticalSection > Warning: resolving _InitializeCriticalSection at 4 by linking to > _InitializeCriticalSection > SCF.o:SCF.c:(.text+0xf35): 
undefined reference to `mfreopen' > SCF.o:SCF.c:(.text+0xf4b): undefined reference to `mfwrite_scf' > SCF.o:SCF.c:(.text+0xf6a): undefined reference to `mfflush' > SCF.o:SCF.c:(.text+0xf72): undefined reference to `mfdestroy' > SCF.o:SCF.c:(.text+0x1138): undefined reference to `write_scf' > SCF.o:SCF.c:(.text+0x16ac): undefined reference to `scf_deallocate' > SCF.o:SCF.c:(.text+0x17b1): undefined reference to `mfreopen' > SCF.o:SCF.c:(.text+0x17c1): undefined reference to `mfread_scf' > SCF.o:SCF.c:(.text+0x19bd): undefined reference to `read_scf' > c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: > SCF.o: bad reloc address 0xa4 in section `.rdata' > c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: > final link failed: Invalid operation > collect2.exe: error: ld returned 1 exit status > dmake.exe: Error code 129, while > making 'blib\arch\auto\Bio\SCF\SCF.dll' > [32m LDS/Bio-SCF-1.03.tar.gz[0m > [31m D:\Perl\site\bin\dmake.exe > -- NOT OK[0m > [32mRunning make test[0m > [32m Can't test without successful > make[0m > [32mRunning make install[0m > [32m Make had returned bad > status, install seems impossible[0m > [32mFailed during this command: > LDS/Bio-SCF-1.03.tar.gz : make NO[0m > [32m[0m > [31mWarning: Configuration not saved.[0m > [32mLockfile removed.[0m > > > Thanks in advance for any useful > suggestions/help!! > Peyman > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From anngregory at email.arizona.edu Wed Feb 20 00:20:41 2013 From: anngregory at email.arizona.edu (Ann Gregory) Date: Tue, 19 Feb 2013 22:20:41 -0700 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file Message-ID: Hi BioPerl, I am having issues with a BioPerl script. I have a blastxml file from a blastx blast and the original multifasta file containing the original nucleotides sequences. 
I want to take the blast result (i.e. the blast description) and annotate my multifasta file.

I have written 2 while loops that extract the blast descriptions as well as the nucleotide sequences from the multifasta file.

My problem is that I cannot incorporate one of the while loops into the other without losing the loop property of one of the loops. I would like to take the 1st blast description, then the 1st nucleotide sequence, then the 2nd blast description, then the 2nd nucleotide sequence and so on... I just can't figure out how to alternate the results.

See script below:

use warnings;
use strict;
use Bio::SearchIO;
use Bio::SeqIO;

my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            my $qd = $hit->description;
            print $qd, "\n";
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

--
Ann (Nina) Gregory
Graduate Student
Rich Lab / Sullivan Lab
Soil, Water, Environmental Science Department
University of Arizona

From yonexhalaolv at gmail.com Wed Feb 20 04:17:12 2013
From: yonexhalaolv at gmail.com (Sebastian Lau)
Date: Wed, 20 Feb 2013 01:17:12 -0800 (PST)
Subject: [Bioperl-l] failed to install via fink：no package found for specification 'bioperl-pm5100'!
Message-ID: <84fa1bcb-a39f-4847-bff2-e3a9c2b909ea@googlegroups.com>

Hi guys,

I was just about to install bioperl on my MacOS 10.7.5 via fink, but after typing the command, fink said it couldn't find any package:

fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm5100
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm5100'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm588
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm588'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm586
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm586'!

I followed the instructions on the wiki. I don't know what's wrong with it. Thanks for your help.

From awitney at sgul.ac.uk Wed Feb 20 10:22:51 2013
From: awitney at sgul.ac.uk (Adam Witney)
Date: Wed, 20 Feb 2013 15:22:51 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To:
References:
Message-ID: <5124EA4B.5020409@sgul.ac.uk>

Hi Ann,

I think what you are proposing assumes that the loop over the BLAST results will come back in the same order as the loop over the FASTA file. This may be the case, but I'm not sure it's something I would rely on.

Anyway, I would loop over the BLAST results, storing the relevant data in an array or hash, and then loop over the FASTA file to put the two together, e.g.:

my $blast_data;

while ( ... blast data ... ) {
    ...
    $blast_data->{$qd} = ...
}

while ( my $seqobj = $seqio->next_seq ) {
    my $id = $seqobj->id;
    print $blast_data->{$id}."\n";
}

Something along those lines... or have I misunderstood you? If so, can you provide some more details, like what you want your output to look like?

HTH

Adam

From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 11:24:50 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:24:50 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To:
References:
Message-ID: <5124F8D2.4020904@uni-wuerzburg.de>

Oops, I just realized I had one loop too many in there. Adam is correct. Sorry.

The last part of the code I sent you should look like this:

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    print ">$hits{$seqobj->display_id}\n";
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 11:14:29 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:14:29 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To:
References:
Message-ID: <5124F665.5050602@uni-wuerzburg.de>

Hi Ann,

I agree with Adam, but I was already writing my email when his came in. Hope it helps:

I hope I understand correctly what you want to do. Just to clarify, you queried a protein blast database with blastx and nucleotide queries. Now you want to associate the protein description of the FIRST blast hit with the corresponding nucleotide fasta sequence. Is that correct?

You have to put the two while loops into one another, or associate the blast hits with the query descriptions. But it's not feasible to take the first blast hit and the first nucleotide fasta seq, then the 2nd of both etc., as Adam already pointed out. You would have to iterate through both at the same time, i.e. take the first blast hit, then iterate through the nucleotide fasta until you find the matching sequence. Then take the 2nd blast hit and iterate through the nucleotide fasta etc.

It's probably easiest to do this with a hash. Something along the lines of (not tested, I just punched that into the e-mail):

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            if ($hit->description eq $hit_desc) { # Only want the first blast hit
                next;
            }
            my $hit_desc = $hit->description;
            $hits{$result->query_description} = $hit_desc;
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
foreach my $query (keys %hits) {
    while (my $seqobj = $seqio->next_seq) {
        if ($seqobj->display_id eq $query) {
            print ">$hits{$query}\n";
            my $nuc = $seqobj->seq();
            print $nuc, "\n";
        }
    }
}

You might want to put an e-value cutoff in there to keep only significant hits. Also, if your nucleotide query multi-fasta file is very large, you might consider creating an index first:
http://www.bioperl.org/wiki/HOWTO:Local_Databases#Bio::Index

Hope that helps!
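For the e-value cutoff, a rough and untested sketch could look like the following. It checks the first HSP of the first reported hit against a threshold before storing the description. The 1e-5 threshold and the helper names is_significant() and best_hits() are assumptions made up for illustration; evalue() is the HSP method provided through Bio::SearchIO.

```perl
use strict;
use warnings;

# Hypothetical helper: true if an e-value is defined and at or below the cutoff.
sub is_significant {
    my ($evalue, $max_evalue) = @_;
    return defined($evalue) && $evalue <= $max_evalue;
}

# Hypothetical helper: map each query description to the description of its
# first reported hit, keeping only hits whose first HSP passes the cutoff.
sub best_hits {
    my ($blast_xml, $max_evalue) = @_;
    require Bio::SearchIO;    # loaded lazily; needs BioPerl installed
    my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => $blast_xml);
    my %hits;
    while (my $result = $search_in->next_result) {
        my $hit = $result->next_hit or next;    # query without any hit
        my $hsp = $hit->next_hsp    or next;    # hit without any HSP
        next unless is_significant($hsp->evalue, $max_evalue);
        $hits{ $result->query_description } = $hit->description;
    }
    return \%hits;
}

# Only runs when a BLAST XML file is given on the command line.
if (@ARGV) {
    my $hits = best_hits($ARGV[0], 1e-5);    # 1e-5 is an arbitrary example cutoff
    print "$_\t$hits->{$_}\n" for sort keys %$hits;
}
```

Queries whose best hit is weaker than the cutoff simply never get a key in the hash, so they can be skipped later in the same way as queries without hits.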
Cheers,
Andreas

P.S.: Next time, please include the version numbers of BioPerl and Perl and a little more detail about what you want to do. ;-)

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 12:00:51 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 18:00:51 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To:
References: <5124F8D2.4020904@uni-wuerzburg.de>
Message-ID: <51250143.9050503@uni-wuerzburg.de>

Hey Ann,

damn, it's not my best day ... Anyway, I wouldn't work with List::MoreUtils's each_array function, as this assumes that the blast hits and the nucleotide queries are in the same order (as Adam pointed out). Rather use a hash, which associates a key with a certain value. Also, the hash can be used to skip sequences that have no hits.
Here's my new version:

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            $hits{$result->query_description} = $hit->description; # hash: associate query_desc (key) with hit_desc (value)
            last; # jump out of the while loop; this should resolve getting only the first hit
        }
        last; # see above
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    if ($hits{$seqobj->display_id}) { # only true if display_id is associated with a hit_desc, so seqs without hits are skipped
        print ">$hits{$seqobj->display_id}\n";
        my $nuc = $seqobj->seq();
        print $nuc, "\n";
    }
}

Cheers,
Andreas

P.S.: I redirected your mail to the BioPerl mailing list, others might profit from my mistakes ;-) ...

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 17:35, Ann Gregory wrote:
> Hi Andreas,
>
> Thanks for your help! I don't understand how this gets the first blast hit:
>
> if ($hit->description eq $hit_desc) { # Only want the first blast hit
>     next;
> }
>
> I tried this and it seems to be working... but I can't get the 1st blast hit
> or skip the sequences that had no hits. Do you know any quick fixes?
>
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> use List::MoreUtils qw(each_array);
>
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
> my @ids;
> while (my $result = $search_in->next_result) {
>     while (my $hit = $result->next_hit) {
>         while (my $hsp = $hit->next_hsp) {
>             my $match = $result->num_hits;
>             push(@ids, $qd);
>         }
>     }
> }
> }
>
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> my @seqs;
> while (my $seqobj = $seqio->next_seq) {
>     my $nuc = $seqobj->seq();
>     push(@seqs, $nuc);
> }
>
> my $it = each_array(@ids, @seqs);
> while (my ($ids, $seqs) = $it->()) {
>     print $ids, "\n", $seqs, "\n";
> }
>
> Thanks again!
> ~Ann

From cjfields at illinois.edu Wed Feb 20 13:24:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 20 Feb 2013 18:24:58 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <51250143.9050503@uni-wuerzburg.de>
References: <5124F8D2.4020904@uni-wuerzburg.de> <51250143.9050503@uni-wuerzburg.de>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2EB4A@CHIMBX5.ad.uillinois.edu>

If this is meant to be done using the same FASTA files for a bunch of BLAST reports, it might be worth setting up a flat-file index and using that to look up and grab the sequences; it should be a LOT faster, and only the first pass (generation of the initial index) would take a little time. Look at Bio::DB::Fasta for an example.
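A minimal, untested sketch of such an indexed lookup, assuming a hash of query ID => hit description like the one built from the BLAST report earlier in this thread. The sample IDs and descriptions and the helpers fasta_record() and print_annotated() are made up for illustration; Bio::DB::Fasta->new() and its seq() method are the calls doing the actual work, and as far as I know the index is written next to the FASTA file on the first run and reused afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: format one annotated FASTA record.
sub fasta_record {
    my ($id, $desc, $seq) = @_;
    return ">$id $desc\n$seq\n";
}

# Hypothetical helper: fetch each query sequence by ID from the indexed
# FASTA file and print it together with its BLAST hit description.
sub print_annotated {
    my ($fasta_file, $hits) = @_;
    require Bio::DB::Fasta;    # loaded lazily; needs BioPerl installed
    my $db = Bio::DB::Fasta->new($fasta_file);    # builds or reuses the index
    for my $id (sort keys %$hits) {
        my $seq = $db->seq($id);
        next unless defined $seq;    # ID not present in the FASTA file
        print fasta_record($id, $hits->{$id}, $seq);
    }
}

# Example data standing in for the hash filled from the BLAST report.
my %hits = (
    contig_1 => 'hypothetical protein',
    contig_7 => 'DNA polymerase III subunit alpha',
);

# Only runs when a FASTA file is given on the command line.
print_annotated($ARGV[0], \%hits) if @ARGV;
```

The lookup is by ID rather than by position, so the order of the BLAST results and of the FASTA file no longer matters.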
chris

From carandraug+dev at gmail.com Mon Feb 25 05:08:23 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Mon, 25 Feb 2013 10:08:23 +0000
Subject: [Bioperl-l] module for description of sequence variants (where to place code)
Message-ID:

Hi

I'm writing a perl module to write a description of the variance between 2 sequences as described on http://www.hgvs.org/mutnomen/recs-prot.html
Basically, given 2 sequences, it would return something like "p.Lys2del p.His25_Met26insGln" if those are the differences. It also accounts for the existence of - characters in the sequences that may come from their alignment.

My question is, where in the project tree should I place the module?

Also, is there something already written that would convert from 1 to 3 letter code?

Carnë

From andreas.leimbach at uni-wuerzburg.de Mon Feb 25 05:32:43 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Mon, 25 Feb 2013 11:32:43 +0100
Subject: [Bioperl-l] module for description of sequence variants (where to place code)
In-Reply-To:
References:
Message-ID: <512B3DCB.7050008@uni-wuerzburg.de>

Hi Carnë,

for your last question: You can convert aa strings from one to three letter code with 'Bio::SeqUtils'.

Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

From genehack at genehack.org Wed Feb 27 19:57:48 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Wed, 27 Feb 2013 16:57:48 -0800
Subject: [Bioperl-l] YAPC talks?
Message-ID:

Hi -

Is there anyone that was planning on submitting a Bioperl talk to YAPC::NA? In an unrelated conversation, one of the organizers expressed an interest in getting a Bioperl talk this year.

If no one else is planning on a talk submission, Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl rules / Bioperl sucks" overview/state of the dist style talk...

thanks,
john.

From cjfields at illinois.edu Wed Feb 27 21:48:55 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 28 Feb 2013 02:48:55 +0000
Subject: [Bioperl-l] YAPC talks?
In-Reply-To:
References:
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6E705CD3@CHIMBX5.ad.uillinois.edu>

At the moment I personally have no plans on going, but I think a no-holds-barred bioperl talk is a good idea.

chris

From hlapp at drycafe.net Wed Feb 27 22:20:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 27 Feb 2013 22:20:34 -0500
Subject: [Bioperl-l] YAPC talks?
In-Reply-To:
References:
Message-ID: <42C1F1B8-FE26-43A8-B601-E80D17D215EC@drycafe.net>

On Feb 27, 2013, at 7:57 PM, John SJ Anderson wrote:

> Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl
> rules / Bioperl sucks" overview/state of the dist style talk...

Please videotape.
I'll be sure to watch and promote it :-) -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From saladi1 at illinois.edu Thu Feb 28 01:58:20 2013 From: saladi1 at illinois.edu (Shyam Saladi) Date: Wed, 27 Feb 2013 22:58:20 -0800 Subject: [Bioperl-l] EUtilities Cookbook - Accn to gi Message-ID: Hi, I think that rettype for the section "Get GIs for a list of accessions" should be -rettype => 'gi'); instead of 'gilist' as it is now. I think this change is due to a change in NCBI eutils. webpage: http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Get_GIs_for_a_list_of_accessions Thanks, Shyam From fossandonc at hotmail.com Thu Feb 28 10:36:34 2013 From: fossandonc at hotmail.com (=?iso-8859-1?Q?Francisco_J._Ossand=F3n?=) Date: Thu, 28 Feb 2013 12:36:34 -0300 Subject: [Bioperl-l] Fix for Bug #3376 broke somewhere else Message-ID: Hi, I was re-checking Bug #3302 using the Bio::SearchIO modules of the repository and found that now it can't parse a Hmmer2 file that was previously fine. After tracking the problem, I discovered that a change in a regular expression to fix another bug broke the parse. The fix for the Bug #3376 consisted in adding an extra condition to omit lines where end of domain indicator is split across lines (https://redmine.open-bio.org/issues/3376): TEST: domain 1 of 1, from 8 to 97: score 184.7, E = 2.5e-56 *->svfqqqqssksttgstvtAiAiAigYRYRYRAvtWnsGsLssGvnDn sv+qqqq+ + +vtAiAiAigYRYRYRAv Wn GsLs G nDn Test 8 SVYQQQQGGSA----MVTAIAIAIGYRYRYRAVVWNKGSLSTGTNDN 50 DnDqqsdgLYtiYYsvtvpssslpsqtviHHHaHkasstkiiikiePr<- DnDq +d LYtiYYsvtv +ss+p q+v+HHHaH+asstkiiiki P Test 51 DNDQAAD-LYTIYYSVTVSASSWPGQSVTHHHAHPASSTKIIIKIAPS 97 * Test - - This case is characterized by the 2 dashes in the line... So the expression added in hmmer2.pm - "next_result"
(https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2): elsif (CORE::length($_) == 0 || ( $count != 1 && /^\s+$/o ) || /^\s+\-?\*\s*$/ || /^.+\-\s+\-\s*$/ ) ### <--- This regex was designed for bug 3376 { next; } But the expression used is too broad because it uses the "^.+" just before the 2 dashes, and it broke the parsing of lines like these, which are full of dashes: KyACrqCdtiVQAPaPakpIErGiptaGLLArvlVSKyaEHlPLYRQsEI lcl|gi|340 - -------------------------------------------------- - yaRqGVeiaRstLadWVgrtgarLaPLvdALaeyVLkeGklHADeTPVqV +i s L V++ + r lcl|gi|340 60938 ------AIMISGLIHGVSARCLRF-------------------------- 60955 I think a reasonable fix that still fixes the original bug and restores the function for this case is to add an extra \s+ in the regex just before the first dash, so the expression makes sure that the first dash is the one that comes AFTER the description (and is replacing the usual coordinate number) and is not the last of an alignment or a series of dashes like the one above: elsif (CORE::length($_) == 0 || ( $count != 1 && /^\s+$/o ) || /^\s+\-?\*\s*$/ || /^.+\s+\-\s+\-\s*$/ ) ### <--- Tweaked regex { next; } I tested it and it works fine, hope you find the fix acceptable. Cheers, -- Francisco J. Ossandon Bioinformatician. Ph.D. Candidate, University Andres Bello. Center for Bioinformatics and Genome Biology, Fundacion Ciencia para la Vida. Santiago, Chile. www.cienciavida.cl/CBGB.htm From PDagosto at edgebio.com Mon Feb 25 11:50:34 2013 From: PDagosto at edgebio.com (Phil Dagosto) Date: Mon, 25 Feb 2013 16:50:34 +0000 Subject: [Bioperl-l] Error when running Build.PL Message-ID: Greetings, I downloaded BioPerl 1.6.1 from this location: http://www.bioperl.org/wiki/Getting_BioPerl When I ran Build.PL with all of the default settings chosen in the interactive mode I got the following error message: Could not get valid metadata. Error is: Invalid metadata structure.
Errors: 'Perl_5' for 'license' does not have a URL scheme (resources -> license) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::gff -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::WebAgent -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::EUtilParameters -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::OntologyIO::InterProParser -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Biblio::IO::medlinexml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::strider -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::RandomFactory -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA::ESEfinder -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::game::gameSubs -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::interpro -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::berkeleydb -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::entrezgene -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tinyseq -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::chadoxml -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::SeqIO::game::gameWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::FileCache -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::bsml_sax -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Primer3 -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::HtSNP -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tree::Compatible -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Taxonomy::entrez -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::agave -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::TagHaplotype -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::SeqFeature::Store::FeatureFileLoader -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::Protein* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::blastxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::EUtilities -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::Tree::Draw::Cladogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tigrxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Collection -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Draw::Pictogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::Writer::BSMLResultWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::HIVQuery -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::TreeIO::svggraph -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::eutils -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern::BackTranslate -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::GenBank -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Variation::IO::xml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::GraphViz -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Annotated -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::DB::NCBIHelper -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::HIV -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Run::RemoteBlast -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::excel -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::ClusterIO::dbsnp -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Microarray::Tools::ReseqChip -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::soap -> requires) [Validation: 1.2] at /usr/local/lib/perl5/5.10.1/Module/Build/Base.pm line 4559 Could not create MYMETA files Creating new 'Build' script for 'BioPerl' version '1.006001' I have no idea whether this is a problem or not or if I can proceed. Also, I'm confused by the version number referenced in the last line. 1.006001 is our current version - I thought I was installing version 1.6.1. Are these version numbers equivalent, i.e., are the zeros not meaningful?. I was actually looking for version 1.2.3 (or greater) - where can I find that? Thanks, Phil Phil Dagosto Sr. 
Software Engineer Edge Bio 201 Perry Parkway, Suite 5 Gaithersburg, MD 20850 pdagosto at edgebio.com (240) 912-8669 From chapmanb at 50mail.com Thu Feb 28 21:30:01 2013 From: chapmanb at 50mail.com (Brad Chapman) Date: Thu, 28 Feb 2013 21:30:01 -0500 Subject: [Bioperl-l] Coming soon: BOSC/Broad Hackathon, BOSC Codefest Message-ID: <874ngvua1i.fsf@fastmail.fm> Hi all; There are some upcoming coding events and conferences of interest to open source biology programmers: - BOSC/Broad Interoperability Hackathon -- This is a two day coding session at the Broad Institute in Cambridge, MA on April 7-8 focused on improving tool interoperability. Sign up and details: http://j.mp/XJT6ew - Codefest at the Bioinformatics Open Source Conference -- This year BOSC is taking place in Berlin from July 19-20 and we'll have a two day coding session before the conference. This is the 4th year of Codefests and they've proven to be a productive and fun time to work collectively on open source projects. Sign up and details: http://www.open-bio.org/wiki/Codefest_2013 BOSC conference: http://www.open-bio.org/wiki/BOSC_2013 Here are the key dates for the events and abstracts: April 7-8, 2013: BOSC/Broad Interoperability Hackathon, Cambridge, MA April 12, 2013: BOSC abstracts due July 17-18, 2013: Codefest 2013, Berlin July 19-20, 2013: BOSC 2013, Berlin Looking forward to seeing everyone this spring and summer for plenty of fun science and code, Brad From jason.stajich at gmail.com Fri Feb 1 01:58:57 2013 From: jason.stajich at gmail.com (Jason Stajich) Date: Thu, 31 Jan 2013 22:58:57 -0800 Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13 In-Reply-To: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com> References: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com> Message-ID: Dan - I think the answer is yes if others are doing it - I am not in a position to be much of a main coder. 
I don't know which format you speak of here or if you had to write something for the text blast changes or something else. Specific bug reports on formats that aren't working is always helpful. The XML format has been pretty stable so I would suggest that if you are simply parsing reports not looking at them. Chris posted instructions on how to contribute and the move to github simplifies this. That you had to write a whole new parser seems probably a bit severe - I hope that in the future people can speak to the problems sooner. If I hit a wall with something I can't do I usually write the code to fix it and contribute it back but I don't play follow-the-format-changes with the tools anymore, but hopefully others like yourself can make the contributions. If you speak to the response I made to the question below, I don't think anyone will be trying and support the NCBI's additional markups that refer to the upstream and downstream features as they are laid out in the text files without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable. best wishes, Jason On Jan 30, 2013, at 1:40 PM, Dan kilburn wrote: > Hi Jason, > > Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help. 
> > --Dan > Sent from my iPhone > > On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote: > >> Send Bioperl-l mailing list submissions to >> bioperl-l at lists.open-bio.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> or, via email, send a message with subject or body 'help' to >> bioperl-l-request at lists.open-bio.org >> >> You can reach the person managing the list at >> bioperl-l-owner at lists.open-bio.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of Bioperl-l digest..." >> >> >> Today's Topics: >> >> 1. Re: Parsing Blast-Report extracting "Features flanking .." >> (Jason Stajich) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Tue, 29 Jan 2013 11:00:16 -0800 >> From: Jason Stajich >> Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features >> flanking .." >> To: buschj at hhu.de >> Cc: bioperl-l at lists.open-bio.org >> Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com> >> Content-Type: text/plain; charset=us-ascii >> >> We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that. >> >> However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features. 
>> >> basically: >> - download the genome and GFF for arabidopsis >> - align your sRNA to the genome with a short read aligner - bowtie, bwa, others >> - convert your sam to bam file with SAMtools or picard >> - compare the location of features with the reads to get expression summaries or individuals reads with BEDTools >> >> >> On Jan 25, 2013, at 2:20 AM, jobu wrote: >> >>> Am 22.01.2013 19:03, schrieb Mgavi Brathwaite: >>>> What upstream and downstream elements are you interested in? >>> >>> >>> I've got a huge pile of short RNA reads. >>> Part of the question now is whether those RNA fragments originate from >>> siRNA events, >>> or may represent miRNAs / parts of pre-miRNAs. >>> >>> So I did an online blast search against database nt. >>> The resulting report quite often just gives subject information like this: >>> >>> ----- >>>> gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence >>> Length=23459830 >>> ----- >>> >>> Now I would like to get the hit's neighbouring regions for further >>> analysis. >>> Preferably I would like to do that in an automized way, but the only >>> possible action with this kind of subject gi | description would be to >>> fetch the entire chromosomal sequence I guess ? >>> >>> However, >>> right below the line above, the report states more precisely: >>> >>> ------ >>> Features flanking this part of subject sequence: >>> 8872 bp at 5' side: cytochrome P450 90B1 >>> 402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K >>> ------ >>> >>> Still I would like to have the possibility to automatically fetch the >>> subject's sequence(s), >>> as of now I think parsing the report with SearchIO won't let me aquire >>> that information, because SearchIO does not recognize report sections >>> like those. >>> >>> I hope I did not miss any of SearchIOs capabilities, but I could not >>> find any method covering my wish?! 
>>> >>> Right now maybe the only way to get the information I want is to >>> construct my own parser and write it out into a separate file, which in >>> turn again I could read into a hash before processing the Blast-Report >>> with SearchIO to combine both data for further automized work. >>> >>> I am aware though that even successfully getting the flanking features >>> would leave me with the more or less wide intergenic gap my hsp is >>> located in. >>> >>> However I'm in need of a way to get the flanking features including >>> their annotation and the region spanning between them. >>> But I hope I do not have to get complete sequences to accomplish that, >>> as this would be kind of an overkill. >>> >>> with kind regards >>> Jochen >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Jason Stajich >> jason.stajich at gmail.com >> jason at bioperl.org >> >> >> >> >> ------------------------------ >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> End of Bioperl-l Digest, Vol 117, Issue 13 >> ****************************************** > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Jason Stajich jason.stajich at gmail.com jason at bioperl.org From dr_kilburn59 at yahoo.com Fri Feb 1 09:25:34 2013 From: dr_kilburn59 at yahoo.com (Dan Kilburn) Date: Fri, 1 Feb 2013 06:25:34 -0800 (PST) Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13 In-Reply-To: References: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com> Message-ID: <1359728734.27412.YahooMailNeo@web162006.mail.bf1.yahoo.com> Hi Jason, ? Thanks for?the detailed feedback.? 
The real reason I had to write my own parser is that even with close, repeated support from NCBI we couldn't get XML output with short_web_blast.pl because the parameter that turns on XML output was not functioning (they've probably fixed it by now), and I had to crank out a parser asap to support a job talk. I don't think the upstream and downstream feature reports are particularly useful, because in mammals they tend to be so far away that they are not likely to be biologically relevant. But the internal motif reports are useful, maybe especially if you are blasting short reads, like I was. A 16-mer preserved domain hit is really good if you're blasting 18-mer Illumina short reads, like I was. As far as my involvement goes, I got diagnosed with cancer on Wednesday, so I'll be taking a step back until next week's surgery and taking a lot of deep breaths. On the other hand, this just makes me more motivated: I've been thinking a lot about time, and timely contributions, the last two days. Cheers, Dan ________________________________ From: Jason Stajich To: Dan kilburn Cc: "bioperl-l at lists.open-bio.org" Sent: Friday, February 1, 2013 1:58 AM Subject: Re: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13 Dan - I think the answer is yes if others are doing it - I am not in a position to be much of a main coder. I don't know which format you speak of here or if you had to write something for the text blast changes or something else. Specific bug reports on formats that aren't working is always helpful. The XML format has been pretty stable so I would suggest that if you are simply parsing reports not looking at them. Chris posted instructions on how to contribute and the move to github simplifies this. That you had to write a whole new parser seems probably a bit severe - I hope that in the future people can speak to the problems sooner.
If I hit a wall with something I can't do I usually write the code to fix it and contribute it back but I don't play follow-the-format-changes with the tools anymore, but hopefully others like yourself can make the contributions. If you speak to the response I made to the question below, I don't think anyone will be trying and support the NCBI's additional markups that refer to the upstream and downstream features as they are laid out in the text files without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable. best wishes, Jason On Jan 30, 2013, at 1:40 PM, Dan kilburn wrote: Hi Jason, > >Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help. > >--Dan >Sent from my iPhone > >On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote: > > >Send Bioperl-l mailing list submissions to >>??bioperl-l at lists.open-bio.org >> >>To subscribe or unsubscribe via the World Wide Web, visit >>??http://lists.open-bio.org/mailman/listinfo/bioperl-l >>or, via email, send a message with subject or body 'help' to >>??bioperl-l-request at lists.open-bio.org >> >>You can reach the person managing the list at >>??bioperl-l-owner at lists.open-bio.org >> >>When replying, please edit your Subject line so it is more specific >>than "Re: Contents of Bioperl-l digest..." >> >> >>Today's Topics: >> >>?1. Re: ?Parsing Blast-Report extracting "Features flanking ???.." >>????(Jason Stajich) >> >> >>---------------------------------------------------------------------- >> >>Message: 1 >>Date: Tue, 29 Jan 2013 11:00:16 -0800 >>From: Jason Stajich >>Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features >>??flanking ???.." 
>>To: buschj at hhu.de >>Cc: bioperl-l at lists.open-bio.org >>Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com> >>Content-Type: text/plain; ???charset=us-ascii >> >>We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that. >> >>However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features. >> >>basically: >>- download the genome and GFF for arabidopsis >>- align your sRNA to the genome with a short read aligner - bowtie, bwa, others >>- convert your sam to bam file with SAMtools or picard >>- compare the location of features with the reads to get expression summaries or individuals reads with BEDTools >> >> >>On Jan 25, 2013, at 2:20 AM, jobu wrote: >> >> >>Am 22.01.2013 19:03, schrieb Mgavi Brathwaite: >>> >>>What upstream and downstream elements are you interested in? >>>> >>> >>>I've got a huge pile of short RNA reads. >>>Part of the question now is whether those RNA fragments originate from >>>siRNA events, >>>or may represent miRNAs / parts of pre-miRNAs. >>> >>>So I did an online ?blast search against database nt. >>>The resulting report quite often just gives subject information like this: >>> >>>----- >>> >>>gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence >>>>Length=23459830 >>>----- >>> >>>Now I would like to get the hit's neighbouring regions ?for further >>>analysis. >>>Preferably I would like to do that ?in an automized way, but the only >>>possible action with this kind of subject gi | description would be to >>>fetch the entire chromosomal ?sequence I guess ? 
>>> >>>However, >>>right below the line above, the report states more precisely: >>> >>>------ >>>Features flanking this part of subject sequence: >>>8872 bp at 5' side: cytochrome P450 90B1 >>>402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K >>>------ >>> >>>Still I would like to have the possibility to automatically fetch the >>>subject's sequence(s), >>>as of now I think ?parsing the report with SearchIO won't let me aquire >>>that information, because SearchIO does not recognize report sections >>>like those. >>> >>>I hope I did not miss any of SearchIOs capabilities, but I could not >>>find any method covering my wish?! >>> >>>Right now maybe the only way to get the information I want is to >>>construct my own parser and write it out into a separate file, which in >>>turn again ?I could read into a hash before processing the Blast-Report >>>with SearchIO to combine both data for further automized work. >>> >>>I am aware though that even successfully getting the flanking features >>>would leave me with the more or less wide ?intergenic gap my hsp is >>>located in. >>> >>>However I'm in need of a way to get the flanking features including >>>their annotation and the region spanning between them. >>>But I hope I do not have to get complete sequences to accomplish that, >>>as this would be kind of an overkill. 
>>> >>>with kind regards >>>Jochen >>> >>> >>> >>>_______________________________________________ >>>Bioperl-l mailing list >>>Bioperl-l at lists.open-bio.org >>>http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>Jason Stajich >>jason.stajich at gmail.com >>jason at bioperl.org >> >> >> >> >>------------------------------ >> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l at lists.open-bio.org >>http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >>End of Bioperl-l Digest, Vol 117, Issue 13 >>****************************************** >> >_______________________________________________ >Bioperl-l mailing list >Bioperl-l at lists.open-bio.org >http://lists.open-bio.org/mailman/listinfo/bioperl-l > Jason Stajich jason.stajich at gmail.com jason at bioperl.org From carandraug+dev at gmail.com Sat Feb 2 20:44:31 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Sun, 3 Feb 2013 01:44:31 +0000 Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option Message-ID: Hi the TCoffee module does not options of the named argument type: -arg => option one needs to do like 'arg' => option Is there a special reason for this? I tracked down this to the commit 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e 12 years ago[1]. A comment on the code actually says "don't want named parameters"[2] (though the commit message sounds pretty innocuous "migrated to new Bio::Root::RootI chained new"). Is there a reason for this? The rest of bioperl has no issue with named parameters, and the API should be the same as Clustalw which also has no problem with it. This is very easy to fix, I can submit a pull request no problem. Also, shouldn't the code complain in the case of non-supported options? Took me a very long time to find out the problem because there was no complaints coming from the code. There is also a problem with the way it handles the output option. 
I'll have to look closer into it, but the documentation is simply incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta' (undocumented), works fine. Carnë [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374 From cjfields at illinois.edu Sun Feb 3 16:54:51 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 3 Feb 2013 21:54:51 +0000 Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu> Carnë, On Feb 2, 2013, at 7:44 PM, Carnë Draug wrote: > Hi > > the TCoffee module does not options of the named argument type: > > -arg => option > > one needs to do like > > 'arg' => option > > Is there a special reason for this? I tracked down this to the commit > > 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e > > 12 years ago[1]. A comment on the code actually says "don't want named > parameters"[2] (though the commit message sounds pretty innocuous > "migrated to new Bio::Root::RootI chained new"). Is there a reason for > this? The rest of bioperl has no issue with named parameters, and the > API should be the same as Clustalw which also has no problem with it. > This is very easy to fix, I can submit a pull request no problem. IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones. This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency. The downside of big changes like this: potential backwards compatibility issues.
Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change. I don't have a problem breaking this with a bioperl 2.0 release, though. > Also, shouldn't the code complain in the case of non-supported > options? Took me a very long time to find out the problem because > there was no complaints coming from the code. Yes, it should complain when options are given that do not make sense, some validation would help there. With some modules this might be a side-effect of using AUTOLOAD or simply not checking the parameters. > There is also a problem with the way it handles the output option. > I'll have to look closer into it, but the documentation is simply > incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta' > (undocumented), works fine. That's entirely possible. > Carn? > [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e > [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374 As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it. Infernal was this way IIRC. Maybe these should just be simply stored as a semi-validated set of key-value pairs. 
chris From carandraug+dev at gmail.com Sun Feb 3 23:34:22 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Mon, 4 Feb 2013 04:34:22 +0000 Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu> Message-ID: On 3 February 2013 21:54, Fields, Christopher J wrote: > On Feb 2, 2013, at 7:44 PM, Carn? Draug wrote: > >> Hi >> >> the TCoffee module does not options of the named argument type: >> >> -arg => option >> >> one needs to do like >> >> 'arg' => option >> >> Is there a special reason for this? I tracked down this to the commit >> >> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e >> >> 12 years ago[1]. A comment on the code actually says "don't want named >> parameters"[2] (though the commit message sounds pretty innocuous >> "migrated to new Bio::Root::RootI chained new"). Is there a reason for >> this? The rest of bioperl has no issue with named parameters, and the >> API should be the same as Clustalw which also has no problem with it. >> This is very easy to fix, I can submit a pull request no problem. > > IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones. This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency. > > The downside of big changes like this: potential backwards compatibility issues. Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change. I don't have a problem breaking this with a bioperl 2.0 release, though. Should passing the tests be enough? There's one for TCofee. 
At the moment I don't see how this would cause compatibility issues: we are adding an option, not removing one. But the comment in the code, stating plainly that the -param API was not wanted, caught me by surprise, which is why I'm asking.

> As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it. Infernal was this way IIRC. Maybe these should just be simply stored as a semi-validated set of key-value pairs.

Judging from a quick glance at the list of TCoffee parameters, I don't at the moment see any that should cause problems.

I have submitted a bug report[1] which mentions some other issues I found with TCoffee. If someone could comment on them, that would be great and I can start fixing them.

Carnë

[1] https://redmine.open-bio.org/issues/3406

From whereverroadgoes at gmail.com Mon Feb 4 10:39:19 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 07:39:19 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
Message-ID: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>

The result I get is:
Number of bases of type A =
Number of bases of type C =
Number of bases of type G =
Number of bases of type T =
i.e. the expected values are missing. Please help!

#! /usr/bin/perl
use Bio::Tools::SeqStats;
use Bio::Seq;
open (FILE, "seq.fasta");
@array = <FILE>;
# Removing first line of fasta
shift (@array);
$array = join('',@array);
open (FILE2, ">>seq2.fasta");
print FILE2 "$array";
$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 'dna',);
my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
my $monomer_ref = $seq_stats->count_monomers();
foreach $base (sort keys %$monomer_ref) {
print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";
}

From hamish.mcwilliam at bioinfo-user.org.uk Mon Feb 4 11:59:16 2013
From: hamish.mcwilliam at bioinfo-user.org.uk (Hamish McWilliam)
Date: Mon, 4 Feb 2013 16:59:16 +0000
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
In-Reply-To:
References: <200305311150.h4VBopn2019091@localhost.localdomain>
Message-ID:

BLASTCLUST is part of the legacy NCBI BLAST package (not NCBI BLAST+) and can be obtained from: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST

As Robert notes, there are many other tools which can be used to perform sequence clustering. Wikipedia has a Sequence Clustering article (http://en.wikipedia.org/wiki/Sequence_clustering) which lists some of the most commonly used.

All the best,

Hamish

On 1 February 2013 04:15, Rob wrote:
> Cyril C.C. Chua bmb.leeds.ac.uk> writes:
>
>>
>> Hi,
>>
>> I have some difficulty in sourcing for BLASTCLUST or related
>> programs/mods. Does any1 know exactly how to locate them?
>> >> Regards >> >> Cyril Chua >> > > > Hi Cyril, > > I heard of the following programmes that might do similar things (I HAVEN'T > used any of them yet): > > Afree - http://www.vicbioinformatics.com/software.afree.shtml > Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf > Usearch - http://www.drive5.com/usearch/ > DomClust - http://mbgd.genome.ad.jp/domclust/ > > or > > Check this: > > http://ppod.princeton.edu/help/help_tech.html > > God bless, > > > Robert > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ---- "Saying the internet has changed dramatically over the last five years is clich? ? the internet is always changing dramatically" - Craig Labovitz, Arbor Networks. From whereverroadgoes at gmail.com Mon Feb 4 12:34:10 2013 From: whereverroadgoes at gmail.com (Slym) Date: Mon, 4 Feb 2013 09:34:10 -0800 (PST) Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: Thanks Roy, It still doesn't seem to produce anything. :/ From roy.chaudhuri at gmail.com Mon Feb 4 12:51:03 2013 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Mon, 4 Feb 2013 17:51:03 +0000 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: Sorry, I'd missed another problem in your code - you are trying to load a fasta file using Bio::PrimarySeq. To read sequence data from a file you should use Bio::SeqIO, see: http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_file http://www.bioperl.org/wiki/HOWTO:SeqIO Cheers, Roy. 
From asjo at koldfront.dk Mon Feb 4 12:58:25 2013
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Mon, 04 Feb 2013 18:58:25 +0100
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> (Slym's message of "Mon, 4 Feb 2013 07:39:19 -0800 (PST)")
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID: <8738xc2c72.fsf@topper.koldfront.dk>

On Mon, 4 Feb 2013 07:39:19 -0800 (PST), Slym wrote:

> #! /usr/bin/perl
> use Bio::Tools::SeqStats;
> use Bio::Seq;

It can be a good idea to add "use strict; use warnings;" to the top of your script. At least two problems in your program would have been caught by perl if you had.

> open (FILE, "seq.fasta");

Using (global) literal filehandles and the two parameter open() is somewhat outdated, a more current way to do it could be:

  open my $fh, '<', 'seq.fasta';

> @array = <FILE>;
> # Removing first line of fasta
> shift (@array);
> $array = join('',@array);
> open (FILE2, ">>seq2.fasta");
> print FILE2 "$array";

Note that you are writing just the sequence to your seq2.fasta file here, so the new file isn't really a fasta file.

> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
> - alphabet => 'dna',);

Bio::PrimarySeq doesn't take a '-file' parameter. Also, note that the filename is different than before "sekw2" vs. "seq2"!

Either you should use Bio::SeqIO with a '-file' parameter, or you can use Bio::PrimarySeq with a '-seq' parameter.

> my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
> my $monomer_ref = $seq_stats->count_monomers();
> foreach $base (sort keys %$monomer_ref) {
> print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";

Here you wanted $monomer_ref->{$base}, as %monomer_ref isn't mentioned anywhere else.
> }

Here is a complete version of your script - I chose to use Bio::SeqIO - that works:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use Bio::SeqIO;
  use Bio::Tools::SeqStats;

  my $io=Bio::SeqIO->new(-file=>'seq.fasta', -alphabet=>'dna');
  my $seqobj=$io->next_seq; # Get the first sequence from the file
  my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
  my $monomer_ref = $seq_stats->count_monomers();
  foreach my $base (sort keys %$monomer_ref) {
      print "Liczba zasad typu ", $base," = ", $monomer_ref->{$base},"\n";
  }

E.g.:

  $ cat seq.fasta
  >test
  aaaacccggt
  $ ./slym.pl
  Liczba zasad typu A = 4
  Liczba zasad typu C = 3
  Liczba zasad typu G = 2
  Liczba zasad typu T = 1
  $

Best regards,

Adam

--
"Grittings. Ma nam is Kahlfin."
Adam Sjøgren
asjo at koldfront.dk

From whereverroadgoes at gmail.com Mon Feb 4 13:02:29 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 10:02:29 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To:
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID:

The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an error (saying that it wants input provided by Bio::PrimarySeq).
(btw in this line
$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 'dna',);
there's a typo "sekw2" instead of "seq2" but this is correct in my original code).
From cjfields at illinois.edu Mon Feb 4 13:54:39 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 4 Feb 2013 18:54:39 +0000 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE161ED@CHIMBX5.ad.uillinois.edu> Please make sure and read both Roy's and Adam's responses all the way through; Bio::SeqIO is not a sequence object but the front-end for format parsing (e.g. FASTA, etc). Bio::PrimarySeq does not have a '-file' parameter, Bio::SeqIO does. If SeqStats truly doesn't work with Bio::Seq we can fix that, but according to Adam he has tested using Bio::SeqIO out and it seems to work. chris On Feb 4, 2013, at 12:02 PM, Slym wrote: > The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an > error (saying that it wants input provided by Bio::PrimarySeq). > (btw in this line > $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => > 'dna',); > there's a typo "sekw2" instead of "seq2" but this is correct in my original > code). > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From asjo at koldfront.dk Mon Feb 4 15:00:32 2013 From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=) Date: Mon, 04 Feb 2013 21:00:32 +0100 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: (Slym's message of "Mon, 4 Feb 2013 10:02:29 -0800 (PST)") References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: <87txpr26jj.fsf@topper.koldfront.dk> On Mon, 4 Feb 2013 10:02:29 -0800 (PST), Slym wrote: > The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an > error (saying that it wants input provided by Bio::PrimarySeq). 
That sounds like you forgot to call ->next_seq() on the Bio::SeqIO object - to get a sequence object - please see the complete, working example I sent earlier. Best regards, Adam -- "Denial springs eternal." Adam Sj?gren asjo at koldfront.dk From scott at scottcain.net Tue Feb 5 09:45:14 2013 From: scott at scottcain.net (Scott Cain) Date: Tue, 5 Feb 2013 09:45:14 -0500 Subject: [Bioperl-l] Have your say in the 2013 GMOD Community Survey! Message-ID: Give us your thoughts on the GMOD project and win a personal DNA test from 23andMe! The GMOD project provides tools like GBrowse, Galaxy, MAKER, JBrowse, Tripal, Apollo, Chado, and many more to a huge community of users and developers around the world. To make sure that GMOD is giving you the support you need, we want to know how you use GMOD, which components you find valuable, your opinion on support, training, and GMOD's strengths and weaknesses. Your feedback is vital in helping GMOD to serve its user community more effectively and to suggest future directions for the project. Do the survey: http://gmod.org/survey.html The survey should take between 10 and 15 minutes (including thinking time), and participants can enter a draw to win "A Journey Through Your DNA", the personal DNA test from 23andMe (the winner can pick a $50 Amazon gift voucher if they prefer). The survey will be open until March 1st. Results will be collated and discussed at the April 2013 GMOD Meeting in Cambridge, UK, and posted on the GMOD wiki at http://gmod.org. Please spread the word to other friends and colleagues who use GMOD: the more voices we hear, the better the picture we get of the needs of our users, and the better we can help you! Do the survey: http://gmod.org/survey.html If you have any questions or problems with the survey, please email me -- I will be happy to help out! Thanks, Scott -- ------------------------------------------------------------------------ Scott Cain, Ph. D. 
scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research

From tiago.hori at gmail.com Tue Feb 5 10:21:55 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:21:55 -0800 (PST)
Subject: [Bioperl-l] Search I::O
Message-ID: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com>

Hi All,

I am trying to find the best putative orthologs for 44K Atlantic Salmon sequences, and so I need to parse 44K BLAST reports to find the best human hit. I am trying to learn Bio::SearchIO, but when I try the first example on the HOWTO:

use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast' -file => 'C001R047.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=", $result->query_name,
            " Hit=", $hit->name,
            " Length=", $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }
  }
}

I get this error:

Odd number of elements in hash assignment at /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

I am using BioPerl version 1.6.901. Is there a format problem with the blast reports?

Any help would be greatly appreciated!

T.
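
A likely cause of the "Odd number of elements" message above is the missing comma between 'blast' and -file in the constructor call: without it, Perl reads 'blast' -file as a subtraction, so three values reach the constructor instead of four. The sketch below reproduces that parse in plain Perl, without BioPerl; the file name report.txt is only a placeholder, not a file from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Without a comma after 'blast', Perl parses  'blast' -file  as the
# numeric subtraction 'blast' - 'file', so two arguments become one.
my @broken = do {
    no warnings 'numeric';    # silence "isn't numeric" for this demo only
    ( -format => 'blast' -file => 'report.txt' );
};

# With the comma, the list is two clean key/value pairs.
my @fixed = ( -format => 'blast', -file => 'report.txt' );

printf "broken: %d elements (%s)\n", scalar @broken, join( ', ', @broken );
printf "fixed:  %d elements (%s)\n", scalar @fixed,  join( ', ', @fixed );
```

If this is indeed what is happening, adding the comma after 'blast' in the HOWTO example should give Bio::SearchIO an even-length list of key/value pairs, and the BLAST reports themselves are probably fine.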
From tiago.hori at gmail.com Tue Feb 5 10:33:32 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Tue, 5 Feb 2013 07:33:32 -0800 (PST)
Subject: [Bioperl-l] Search::IO example from HOWTO
Message-ID:

Hi All,

I am trying to run the example from the Bio::SearchIO HOWTO:

use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast' -file => 'test.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    while( my $hsp = $hit->next_hsp ) {
      ## $hsp is a Bio::Search::HSP::HSPI compliant object
      if( $hsp->length('total') > 50 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Query=", $result->query_name,
            " Hit=", $hit->name,
            " Length=", $hsp->length('total'),
            " Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }
  }
}

And I get this error:

Odd number of elements in hash assignment at /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189.

Can anybody help?

Cheers,

T.

From carandraug+dev at gmail.com Tue Feb 5 13:56:21 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 5 Feb 2013 18:56:21 +0000
Subject: [Bioperl-l] removing packages from bioperl-live
Message-ID:

Hi

Some of the bioperl-live packages have already been split into separate repositories. However, they were never actually removed from bioperl-live. This creates two entry points for bug fixes and implementations. After a chat on #bioperl, I was told to ask here.

Should these be removed? For example, there's bioperl-FeatureIO, but that code also exists in bioperl-live. Can I remove it from bioperl-live?

Carnë

From cjfields at illinois.edu Tue Feb 5 14:34:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 5 Feb 2013 19:34:07 +0000
Subject: [Bioperl-l] Removing Bio::FeatureIO?
was Re: removing packages from bioperl-live In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Probably should retitle this to ask the question directly (make sure the right radars are pinged). My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). chris On Feb 5, 2013, at 12:56 PM, Carn? Draug wrote: > Hi > > some of the bioperl-live packages have already been split into > separate repositories. However, they were never actually removed from > bioperl-live. This creates 2 entry points for bug fixes and > implementations. After a chat on #bioperl, I was told to ask here. > > Should these be removed? For example, there's bioperl-FeatureIO but > that code alo exists in bioperl-live. Can I remove it from > bioperl-live? > > Carn? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Tue Feb 5 14:36:10 2013 From: scott at scottcain.net (Scott Cain) Date: Tue, 5 Feb 2013 14:36:10 -0500 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: I'm sure it will lead to lots of fun, but I suspect you are right and it should be removed. It's time you yank on that bandaid :-) Scott On Tue, Feb 5, 2013 at 2:34 PM, Fields, Christopher J wrote: > Probably should retitle this to ask the question directly (make sure the right radars are pinged). > > My vote is yes, it should be removed. 
There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). > > chris > > On Feb 5, 2013, at 12:56 PM, Carn? Draug wrote: > >> Hi >> >> some of the bioperl-live packages have already been split into >> separate repositories. However, they were never actually removed from >> bioperl-live. This creates 2 entry points for bug fixes and >> implementations. After a chat on #bioperl, I was told to ask here. >> >> Should these be removed? For example, there's bioperl-FeatureIO but >> that code alo exists in bioperl-live. Can I remove it from >> bioperl-live? >> >> Carn? >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From carandraug+dev at gmail.com Tue Feb 5 15:06:23 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 5 Feb 2013 20:06:23 +0000 Subject: [Bioperl-l] dependencies on perl version Message-ID: Hi how much perl backwards compatibility does bioperl needs to keep? If I have something I want to implement and use state (requires 5.010), is it acceptable? 5.010 is already a quite old perl version. Of course, there are other less elegant ways to implement those features. If I can't use modern perl stuff, what version number is the limit? Carn? 
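
To make the trade-off in the question above concrete, here is a small sketch of the same counter written twice: once with state, which needs perl 5.10 or later, and once with a closure over a lexical in a bare block, which should also run on perl 5.8 if the state version and the use 5.010 line are removed. The sub names next_id_state and next_id_closure are made up for illustration and do not come from BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;    # enables the 'state' feature; fails early on older perls

# perl >= 5.10: $count is initialised once and kept between calls
sub next_id_state {
    state $count = 0;
    return ++$count;
}

# perl 5.8 compatible: a lexical in a bare block, captured by the sub
{
    my $count = 0;

    sub next_id_closure {
        return ++$count;
    }
}

# Both counters advance independently, one step per call.
print join( ',', map { next_id_state() } 1 .. 3 ),   "\n";
print join( ',', map { next_id_closure() } 1 .. 3 ), "\n";
```

The closure form is a little more verbose, and the bare block has to run before the sub is first called, but it keeps the code usable on older installations.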
From carandraug+dev at gmail.com Tue Feb 5 15:10:01 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 5 Feb 2013 20:10:01 +0000 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: On 5 February 2013 19:34, Fields, Christopher J wrote: > Probably should retitle this to ask the question directly (make sure the right radars are pinged). > > My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). Mentioning Bio::FeatureIO was just an example. I meant to ask it as more general. If the code is already in a separate repository, should it be removed from bioperl-live? Carn? From cjfields at illinois.edu Tue Feb 5 15:56:48 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 20:56:48 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) chris On Feb 5, 2013, at 2:06 PM, Carn? Draug wrote: > Hi > > how much perl backwards compatibility does bioperl needs to keep? > > If I have something I want to implement and use state (requires > 5.010), is it acceptable? 5.010 is already a quite old perl version. 
> Of course, there are other less elegant ways to implement those > features. If I can't use modern perl stuff, what version number is the > limit? > > Carn? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Feb 5 15:59:38 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 20:59:38 +0000 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 2:10 PM, Carn? Draug wrote: > On 5 February 2013 19:34, Fields, Christopher J wrote: >> Probably should retitle this to ask the question directly (make sure the right radars are pinged). >> >> My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). > > Mentioning Bio::FeatureIO was just an example. I meant to ask it as > more general. If the code is already in a separate repository, should > it be removed from bioperl-live? > > Carn? Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better). Once we get a new release out we should remove the rest. 
chris From cjfields at illinois.edu Tue Feb 5 16:53:29 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 21:53:29 +0000 Subject: [Bioperl-l] Next BioPerl release Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> All, I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: https://github.com/bioperl/Bio-FeatureIO Feedback, suggestions, etc are greatly appreciated. chris From miker at htblis.com Tue Feb 5 19:54:17 2013 From: miker at htblis.com (Michael Rogoff) Date: Tue, 5 Feb 2013 16:54:17 -0800 Subject: [Bioperl-l] Bio::Graphics error when rendering features with Split locations Message-ID: When trying to render features from a genbank file that include a split location e.g.: promoter join(1000..1080,1..5) /label=PROM1 The following exception is raised: Can't locate object method "has_tag" via package "Bio::Location::Simple" at lib/perl5/site_perl/5.10.1/Bio/Graphics/Glyph.pm line 704, line 36. This can be reproduced with the code in the example "Rendering Features from a GenBank or EMBL File" from the Graphics HOW-TO: http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File Is there a way to change the script so that split locations would, at the very least, not cause a fatal error? Is there a different glyph type that needs to be used? Thanks in advance for any help. 
I've attached a simple genbank input that will reproduce the error: LOCUS sample2 1080 bp DNA circular DEFINITION Cloning vector sample2 ACCESSION sample2 VERSION sample2.1 GI:4352432 COMMENT Component Fragments FEATURES Location/Qualifiers terminator 39..328 /label=TERM1 /note="terminator 1" misc_feature 393..488 /label=MF1 CDS complement(800..900) /label=CDS1 /note="resistence gene" promoter join(1000..1080,1..5) /label=PROM1 ORIGIN 1 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 61 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 121 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 181 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 241 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 301 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 361 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 421 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 481 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 541 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 601 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 661 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 721 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 781 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 841 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 901 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 961 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1021 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn // P.S. I think I have traced the source of the problem to Glyph's _subfeat method, which in the case of a feature with split locations is returning location objects instead of feature objects. Is this a bug? 
sub _subfeat { my $class = shift; my $feature = shift; return $feature->segments if $feature->can('segments'); my @split = eval { my $id = $feature->location->seq_id; my @subs = $feature->location->sub_Location; grep {$id eq $_->seq_id} @subs; }; return @split if @split; # Either the APIs have changed, or I got confused at some point... return $feature->get_SeqFeatures if $feature->can('get_SeqFeatures'); return $feature->sub_SeqFeature if $feature->can('sub_SeqFeature'); return; } From l.m.timmermans at students.uu.nl Tue Feb 5 21:40:27 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 03:40:27 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J wrote: > Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. > > (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) I *really* hate saying it, but I fear a lot of places are still stuck on 5.8, in particular on 5.8.8 because of CentOS 5. I know my department still is and doesn't seem to be in a hurry to upgrade, and I'm pretty sure it won't be the only one (though personally I use a self-compiled 5.16). Leon From florent.angly at gmail.com Tue Feb 5 21:51:27 2013 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 06 Feb 2013 12:51:27 +1000 Subject: [Bioperl-l] Removing Bio::FeatureIO? 
was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu> Message-ID: <5111C52F.50101@gmail.com> On 06/02/13 06:59, Fields, Christopher J wrote: > On Feb 5, 2013, at 2:10 PM, Carn? Draug wrote: > >> On 5 February 2013 19:34, Fields, Christopher J wrote: >>> Probably should retitle this to ask the question directly (make sure the right radars are pinged). >>> >>> My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). >> Mentioning Bio::FeatureIO was just an example. I meant to ask it as >> more general. If the code is already in a separate repository, should >> it be removed from bioperl-live? >> >> Carn? > Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better). Once we get a new release out we should remove the rest. > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Sounds good to me (I've been burnt once by the fact that Bio::FeatureIO is in two places). 
Florent From florent.angly at gmail.com Tue Feb 5 21:56:19 2013 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 06 Feb 2013 12:56:19 +1000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: <5111C653.2010703@gmail.com> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). Florent On 06/02/13 12:40, Leon Timmermans wrote: > On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J > wrote: >> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >> >> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) > I *really* hate saying it, but I fear a lot of places are still stuck > on 5.8, in particular on 5.8.8 because of CentOS 5. I know my > department still is and doesn't seem to be in a hurry to upgrade, and > I'm pretty sure it won't be the only one (though personally I use a > self-compiled 5.16). > > Leon > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at drycafe.net Tue Feb 5 22:27:35 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 5 Feb 2013 22:27:35 -0500 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <09524241-59F8-4BFF-8054-53CD0A649C11@drycafe.net> On Feb 5, 2013, at 4:53 PM, Fields, Christopher J wrote: > I am scheduling the next BioPerl CPAN release tentatively for March 1. Yay!! Thanks for your leadership again, Chris, and for volunteering your time for the project. 
If nothing else, and I know this is no compensation really worth speaking of, we owe you beer, and I'll certainly pay my debt to you in Berlin if you come there. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From hlapp at drycafe.net Tue Feb 5 22:32:40 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 5 Feb 2013 22:32:40 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <5111C653.2010703@gmail.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS. 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. -hilmar On Feb 5, 2013, at 9:56 PM, Florent Angly wrote: > For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). > Florent > > On 06/02/13 12:40, Leon Timmermans wrote: >> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J >> wrote: >>> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >>> >>> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) >> I *really* hate saying it, but I fear a lot of places are still stuck >> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my >> department still is and doesn't seem to be in a hurry to upgrade, and >> I'm pretty sure it won't be the only one (though personally I use a >> self-compiled 5.16). 
>> >> Leon >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Tue Feb 5 22:58:08 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 03:58:08 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18CBE@CHIMBX5.ad.uillinois.edu> Re: being held back, I agree. I don't necessarily want to intentionally break current modules by adding modern code unless it can be demonstrated to be a decent benefit performance-wise, but I don't want to impede new additions by requiring compat with perl 5.8 (hence my suggestion of a 'use 5.01x' pragma when appropriate). Ubuntu 12.04 LTS is on perl 5.14.2: http://askubuntu.com/questions/80672/what-perl-version-will-be-in-12-04-lts BTW, I was wrong about perl 5.8 being 8 yrs old; it's almost 11 yrs old (perl 5.8.0 was released on 7/18/2002). perl 5.8 reached end-of-life in 2008, fixes being only for security reasons. So, I support dropping perl 5.8 support, but we should have a decent route of use for the folks stuck on old clusters. chris On Feb 5, 2013, at 9:32 PM, Hilmar Lapp wrote: > Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS. > > 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. 
> > -hilmar > > On Feb 5, 2013, at 9:56 PM, Florent Angly wrote: > >> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). >> Florent >> >> On 06/02/13 12:40, Leon Timmermans wrote: >>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J >>> wrote: >>>> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >>>> >>>> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) >>> I *really* hate saying it, but I fear a lot of places are still stuck >>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my >>> department still is and doesn't seem to be in a hurry to upgrade, and >>> I'm pretty sure it won't be the only one (though personally I use a >>> self-compiled 5.16). >>> >>> Leon >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From l.m.timmermans at students.uu.nl Tue Feb 5 23:11:52 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 05:11:52 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: On Wed, Feb 6, 2013 at 4:32 
AM, Hilmar Lapp wrote: > Does anyone know what Ubuntu uses? 5.14.2, distrowatch is your friend ;-) > I've heard lots of other old version problems with CentOS. I know people who still use CentOS 4 in production :-| > 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. CentOS 5 is 6 years old (and will be supported another 4), but CentOS 6 is 'only' 19 months. perl missing a release in the 5.8-5.10 timeframe combined with an unfortunate alignment of its release schedule with Red Hat's don't do us any favors here. Leon From cjfields at illinois.edu Tue Feb 5 23:14:24 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 04:14:24 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18E52@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 8:40 PM, Leon Timmermans wrote: > On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J > wrote: >> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >> >> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) > > I *really* hate saying it, but I fear a lot of places are still stuck > on 5.8, in particular on 5.8.8 because of CentOS 5. I know my > department still is and doesn't seem to be in a hurry to upgrade, and > I'm pretty sure it won't be the only one (though personally I use a > self-compiled 5.16). > > Leon We had the same problem for a while, but our sysadmins were willing to set up perl 5.12 (at that time) loadable as a module (we can of course set up a local perl as well). We're now using a sysadmin-installed perl 5.16 with our current cluster. 
chris From cjfields at illinois.edu Tue Feb 5 23:24:31 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 04:24:31 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 10:11 PM, Leon Timmermans wrote: > On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp wrote: >> Does anyone know what Ubuntu uses? > > 5.14.2, distrowatch is your friend ;-) > >> I've heard lots of other old version problems with CentOS. > > I know people who still use CentOS 4 in production :-| > >> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. > > CentOS 5 is 6 years old (and will be supported another 4), but CentOS > 6 is 'only' 19 months. perl missing a release in the 5.8-5.10 > timeframe combined with an unfortunate alignment of its release > schedule with Red Hat's don't do us any favors here. > > Leon Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7). We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases. 
chris From l.m.timmermans at students.uu.nl Tue Feb 5 23:33:57 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 05:33:57 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 6, 2013 at 5:24 AM, Fields, Christopher J wrote: > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7). > > We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases. Sounds reasonable. These things shouldn't come as a surprise. I suspect that the thing that will save us is that most of these people install it once and then never upgrade. Leon From hartzell at alerce.com Wed Feb 6 12:58:07 2013 From: hartzell at alerce.com (George Hartzell) Date: Wed, 6 Feb 2013 09:58:07 -0800 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> Message-ID: <20754.39343.128576.743448@gargle.gargle.HOWL> Fields, Christopher J writes: > [...] > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point > out that Python users are in the same boat: the Python version for > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > (and recommends python 2.7). 
> > We can always state that perl 5.8 is supported for the upcoming > Bioperl release, but we're dropping v5.8 support for any future > releases. Do more than drop support for 5.8. The Perl community has put a transparent and predictable process in place for releasing [generally] better versions of the language. It means that Perl has a chance of continuing to be relevant, attracting new talent and actually *fixing* some of the s&%t that gives Perl a bad rap. It gives people something to plan around, no one should be surprised that v 5.X.Y is coming out in mid 20ZZ. BioPerl should do the same thing, declare a release policy that trails along with the Perl release schedule. Keep it simple and no one can argue with it. Support Perl releases as long as the releases themselves are supported. Rather than expending energy supporting out of date platforms, put the energy into being modern (or Modern...), better distro building and packaging, testing, documentation and releasing so that the process of staying current is painless. Look forward. Keep it interesting and fun. Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone make their living running sequencing gels in Plexiglas doohickeys on their lab bench? I'm not suggesting that the BioPerl community is free to make arbitrary and capricious changes that makes it difficult for *anyone* to get anything done. Churn is a waste of time. But why should the all-volunteer BioPerl community be stuck supporting code from 12 years ago because it's cost effective for someone else to avoid spending *their* $/time/people to stay up to date. Those sites that value stability/maturity/stagnation so highly have already accepted the cost/difficulty of nailing one of their feet to the floor as they try to run forward. They recognize and depend on the benefits of having that stable base but generally they've also accepted the costs associated with their restrictive choices. 
They know how to pull in separate kernel/driver updates so that they can actually run on nearly modern hardware. They know, and live with, the fact that they're not going to have access to the shiny new stuff. And they know how to stay up to date, when they need to, with the software that their users need to be competitive (e.g. BioConductor and R). As long as (if/when...) updating a BioPerl release is something that can reliably happen with a few cpanm invocations then the sites that otherwise favor punctuated equilibrium will learn to handle gradual change. Those folks that are "stuck" on older releases always have the option of supporting professional Perl programmers to keep older releases going, backport changes, etc.... They're already buying support for their platforms (or freeloading and coping), let them put bread on the table at one of the bioinformatics consultancies or labs if they have something special they need. Have fun. Use sharp tools. Do cool science. Build cool things. No one is paying you to be backwards compatible with the previous millennium. g. From amackey at virginia.edu Wed Feb 6 13:47:46 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Wed, 6 Feb 2013 13:47:46 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: Huzzah! -- Aaron J. Mackey, PhD Assistant Professor Center for Public Health Genomics University of Virginia amackey at virginia.edu http://www.cphg.virginia.edu/mackey On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: > > [...] > > Right, it took ~8 yrs to go from 5.8 to 5.10. 
I'd like to point > > out that Python users are in the same boat: the Python version for > > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > > (and recommends python 2.7). > > > > We can always state that perl 5.8 is supported for the upcoming > > Bioperl release, but we're dropping v5.8 support for any future > > releases. > > Do more than drop support for 5.8. > > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. > > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. 
> > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From tiago.hori at gmail.com Wed Feb 6 08:25:41 2013 From: tiago.hori at gmail.com (Tiago Hori) Date: Wed, 6 Feb 2013 05:25:41 -0800 (PST) Subject: [Bioperl-l] Problems installing Bio::Tools::Run:StandAloneBlastPlus Message-ID: <9b488c6e-34b3-4269-a7ac-e2206720939a@googlegroups.com> Hi Guys, I am trying to install the module Bio::Tools::Run::StandAloneBlastPlus, but it has been hard so far. I managed to compile and install samtools after finding all the dependencies, but I am still missing something! I have posted the complete report below. Any help would be great! Cheers, T. cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus Reading '/home/tiagohori/.cpan/Metadata' Database was generated on Tue, 05 Feb 2013 18:41:03 GMT Running install for module 'Bio::Tools::Run::StandAloneBlastPlus' Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Checksum for /home/tiagohori/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz ok Scanning cache /home/tiagohori/.cpan/build for sizes ..................................------------------------------------------DONE DEL(1/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz DEL(2/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz.yml DEL(3/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO DEL(4/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO.yml DEL(5/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC DEL(6/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC.yml DEL(7/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt DEL(8/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt.yml DEL(9/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4 DEL(10/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4.yml DEL(11/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5 
DEL(12/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5.yml DEL(13/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn DEL(14/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn.yml DEL(15/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o DEL(16/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o.yml DEL(17/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U DEL(18/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U.yml DEL(19/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v DEL(20/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v.yml CPAN.pm: Building C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Install scripts? y/n [n ] n Do you want to run tests that require connection to servers across the internet (likely to cause some failures)? y/n [n ] n - will not run internet-requiring tests Created MYMETA.yml and MYMETA.json Creating new 'Build' script for 'BioPerl-Run' version '1.006900' Building BioPerl-Run CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build -- OK Running Build test t/Amap.t ...................... 1/18 # Required executable for Bio::Tools::Run::Alignment::Amap is not present t/Amap.t ...................... ok t/AnalysisFactory_soap.t ...... skipped: Network tests have not been requested t/Analysis_soap.t ............. skipped: Network tests have not been requested t/BEDTools.t .................. 3/423 # Required executable for Bio::Tools::Run::BEDTools is not present t/BEDTools.t .................. ok t/BWA.t ....................... 1/36 # Required executable for Bio::Tools::Run::BWA is not present t/BWA.t ....................... ok t/Blat.t ...................... 1/33 # Required executable for Bio::Tools::Run::Alignment::Blat is not present # Looks like you planned 33 tests but ran 20. t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 13/33 subtests (less 15 skipped subtests: 5 okay) t/Bowtie.t .................... 
1/73 # Required executable for Bio::Tools::Run::Bowtie is not present t/Bowtie.t .................... ok t/Cap3.t ...................... 1/91 # Required executable for Bio::Tools::Run::Cap3 is not present t/Cap3.t ...................... ok t/Clustalw.t .................. 1/45 # Required executable for Bio::Tools::Run::Alignment::Clustalw is not present t/Clustalw.t .................. ok t/Coil.t ...................... 2/6 # Required executable for Bio::Tools::Run::Coil is not present t/Coil.t ...................... ok t/Consense.t .................. 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::Consense is not present t/Consense.t .................. ok t/DBA.t ....................... 1/18 # Required executable for Bio::Tools::Run::Alignment::DBA is not present t/DBA.t ....................... ok t/DrawGram.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawGram is not present t/DrawGram.t .................. ok t/DrawTree.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawTree is not present t/DrawTree.t .................. ok t/EMBOSS.t .................... ok t/Ensembl.t ................... skipped: Network tests have not been requested t/Eponine.t ................... 1/7 # Looks like you planned 7 tests but ran 2. t/Eponine.t ................... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 5/7 subtests t/Exonerate.t ................. 1/89 # Required executable for Bio::Tools::Run::Alignment::Exonerate is not present t/Exonerate.t ................. ok t/FootPrinter.t ............... 1/24 # Required executable for Bio::Tools::Run::FootPrinter is not present t/FootPrinter.t ............... ok t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable $GENEMARK_MODELS is not set t/Genemark.hmm.prokaryotic.t .. ok t/Genewise.t .................. 1/20 # Required executable for Bio::Tools::Run::Genewise is not present t/Genewise.t .................. 
ok t/Genscan.t ................... 1/6 # Required environment variable $GENSCANDIR is not set t/Genscan.t ................... ok t/Gerp.t ...................... 1/33 # Required executable for Bio::Tools::Run::Phylo::Gerp is not present t/Gerp.t ...................... ok t/Glimmer2.t .................. 1/217 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer2.t .................. ok t/Glimmer3.t .................. 1/111 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer3.t .................. ok t/Gumby.t ..................... 1/124 # Required executable for Bio::Tools::Run::Phylo::Gumby is not present t/Gumby.t ..................... ok t/Hmmer.t ..................... 1/27 # Required executable for Bio::Tools::Run::Hmmer is not present t/Hmmer.t ..................... ok t/Hyphy.t ..................... 2/15 # Required executable for Bio::Tools::Run::Phylo::Hyphy::SLAC is not present t/Hyphy.t ..................... ok t/Infernal.t .................. 1/43 # Required executable for Bio::Tools::Run::Infernal is not present t/Infernal.t .................. ok t/Kalign.t .................... 1/8 # Required executable for Bio::Tools::Run::Alignment::Kalign is not present t/Kalign.t .................... ok t/LVB.t ....................... 1/19 # Required executable for Bio::Tools::Run::Phylo::LVB is not present t/LVB.t ....................... ok t/Lagan.t ..................... 1/12 # Required executable for Bio::Tools::Run::Alignment::Lagan is not present t/Lagan.t ..................... ok t/MAFFT.t ..................... 1/17 # Required executable for Bio::Tools::Run::Alignment::MAFFT is not present t/MAFFT.t ..................... ok t/MCS.t ....................... 1/24 # Required executable for Bio::Tools::Run::MCS is not present t/MCS.t ....................... ok t/Maq.t ....................... 1/51 # Required executable for Bio::Tools::Run::Maq is not present t/Maq.t ....................... ok t/Match.t ..................... 
1/7 # Required executable for Bio::Tools::Run::Match is not present t/Match.t ..................... ok t/Mdust.t ..................... 1/5 # Required executable for Bio::Tools::Run::Mdust is not present t/Mdust.t ..................... ok t/Meme.t ...................... 1/25 # Required executable for Bio::Tools::Run::Meme is not present t/Meme.t ...................... ok t/Minimo.t .................... 1/72 # Required executable for Bio::Tools::Run::Minimo is not present t/Minimo.t .................... ok t/Molphy.t .................... 1/10 # Required executable for Bio::Tools::Run::Phylo::Molphy::ProtML is not present t/Molphy.t .................... ok t/Muscle.t .................... 1/16 # Required executable for Bio::Tools::Run::Alignment::Muscle is not present t/Muscle.t .................... ok t/Neighbor.t .................. 1/17 # Required executable for Bio::Tools::Run::Phylo::Phylip::Neighbor is not present t/Neighbor.t .................. ok t/Newbler.t ................... 1/98 # Required executable for Bio::Tools::Run::Newbler is not present t/Newbler.t ................... ok t/Njtree.t .................... 1/6 # Required executable for Bio::Tools::Run::Phylo::Njtree::Best is not present t/Njtree.t .................... ok t/PAML.t ...................... 1/28 # Required executable for Bio::Tools::Run::Phylo::PAML::Codeml is not present t/PAML.t ...................... ok t/Pal2Nal.t ................... 1/9 # Required executable for Bio::Tools::Run::Alignment::Pal2Nal is not present t/Pal2Nal.t ................... ok t/PhastCons.t ................. 1/181 # Required executable for Bio::Tools::Run::Phylo::Phast::PhastCons is not present t/PhastCons.t ................. ok t/Phrap.t ..................... 1/127 # Required executable for Bio::Tools::Run::Phrap is not present t/Phrap.t ..................... ok t/Phyml.t ..................... 1/47 # Required executable for Bio::Tools::Run::Phylo::Phyml is not present t/Phyml.t ..................... 
ok t/Primate.t ................... 1/8 # Required executable for Bio::Tools::Run::Primate is not present t/Primate.t ................... ok t/Primer3.t ................... 1/9 # Required executable for Bio::Tools::Run::Primer3 is not present t/Primer3.t ................... ok t/Prints.t .................... 1/7 # Required executable for Bio::Tools::Run::Prints is not present t/Prints.t .................... ok t/Probalign.t ................. 1/13 # Required executable for Bio::Tools::Run::Alignment::Probalign is not present t/Probalign.t ................. ok t/Probcons.t .................. 1/11 # Required executable for Bio::Tools::Run::Alignment::Probcons is not present t/Probcons.t .................. ok t/Profile.t ................... 1/7 # Required executable for Bio::Tools::Run::Profile is not present t/Profile.t ................... ok t/Promoterwise.t .............. 1/9 # Required executable for Bio::Tools::Run::Promoterwise is not present t/Promoterwise.t .............. ok t/ProtDist.t .................. 1/14 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtDist is not present t/ProtDist.t .................. ok t/ProtPars.t .................. 1/11 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtPars is not present t/ProtPars.t .................. ok t/Pseudowise.t ................ 1/18 # Required executable for Bio::Tools::Run::Pseudowise is not present t/Pseudowise.t ................ ok t/QuickTree.t ................. 1/13 # Required executable for Bio::Tools::Run::Phylo::QuickTree is not present t/QuickTree.t ................. ok t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as or not executable. # Required executable for Bio::Tools::Run::RepeatMasker is not present t/RepeatMasker.t .............. ok t/SABlastPlus.t ............... 1/65 # Required executable for Bio::Tools::Run::BlastPlus is not present # Looks like you planned 65 tests but ran 63. t/SABlastPlus.t ............... 
Dubious, test returned 255 (wstat 65280, 0xff00) Failed 2/65 subtests (less 59 skipped subtests: 4 okay) t/SLR.t ....................... 1/7 # Required executable for Bio::Tools::Run::Phylo::SLR is not present t/SLR.t ....................... ok t/Samtools.t .................. ok t/Seg.t ....................... 1/8 # Required executable for Bio::Tools::Run::Seg is not present t/Seg.t ....................... ok t/Semphy.t .................... 1/19 # Required executable for Bio::Tools::Run::Phylo::Semphy is not present t/Semphy.t .................... ok t/SeqBoot.t ................... 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present t/SeqBoot.t ................... ok t/Signalp.t ................... 1/7 # Required executable for Bio::Tools::Run::Signalp is not present t/Signalp.t ................... ok t/Sim4.t ...................... 1/23 # Required executable for Bio::Tools::Run::Alignment::Sim4 is not present t/Sim4.t ...................... ok t/Simprot.t ................... 1/6 # Required executable for Bio::Tools::Run::Simprot is not present t/Simprot.t ................... ok t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/SoapEU-unit.t ............... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/StandAloneFasta.t ........... 1/15 # Required executable for Bio::Tools::Run::Alignment::StandAloneFasta is not present t/StandAloneFasta.t ........... ok t/TCoffee.t ................... 1/27 # Required executable for Bio::Tools::Run::Alignment::TCoffee is not present t/TCoffee.t ................... ok t/TigrAssembler.t ............. 1/88 # Required executable for Bio::Tools::Run::TigrAssembler is not present # Required executable for Bio::Tools::Run::TigrAssembler is not present t/TigrAssembler.t ............. ok t/Tmhmm.t ..................... 
1/9 # Required executable for Bio::Tools::Run::Tmhmm is not present t/Tmhmm.t ..................... ok t/TribeMCL.t .................. ok t/Vista.t ..................... ok t/gmap-run.t .................. 1/8 # Required executable for Bio::Tools::Run::Alignment::Gmap is not present t/gmap-run.t .................. ok t/tRNAscanSE.t ................ 1/12 # Required executable for Bio::Tools::Run::tRNAscanSE is not present t/tRNAscanSE.t ................ ok Test Summary Report ------------------- t/Blat.t (Wstat: 65280 Tests: 20 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 33 tests but ran 20. t/Eponine.t (Wstat: 65280 Tests: 2 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 7 tests but ran 2. t/SABlastPlus.t (Wstat: 65280 Tests: 63 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 65 tests but ran 63. Files=80, Tests=2876, 39 wallclock secs ( 0.54 usr 0.23 sys + 32.54 cusr 4.94 csys = 38.25 CPU) Result: FAIL Failed 3/80 test programs. 0/2876 subtests failed. CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build test -- NOT OK //hint// to see the cpan-testers results for installing this module, try: reports CJFIELDS/BioPerl-Run-1.006900.tar.gz Running Build install make test had returned bad status, won't install without force From guy.leonard at gmail.com Wed Feb 6 13:35:38 2013 From: guy.leonard at gmail.com (guy.leonard at gmail.com) Date: Wed, 6 Feb 2013 10:35:38 -0800 (PST) Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> Nice, super work. Will there be a rough list of feature changes/addition/deprecation, or shall I consult git logs? 
On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote: > > All, > > I am scheduling the next BioPerl CPAN release tentatively for March 1. > Any help in triaging bug reports would be greatly appreciated! > > Amongst all other changes, as mentioned in a separate thread we will > remove Bio::FeatureIO, now developed in a separate repository: > > https://github.com/bioperl/Bio-FeatureIO > > Feedback, suggestions, etc are greatly appreciated. > > chris > _______________________________________________ > Bioperl-l mailing list > Biop... 
at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From sidd.basu at gmail.com Wed Feb 6 14:36:17 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Wed, 6 Feb 2013 13:36:17 -0600 Subject: [Bioperl-l] Re: Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> Hi, On Tue, 05 Feb 2013, Fields, Christopher J wrote: > All, > > I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! > > Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: > > https://github.com/bioperl/Bio-FeatureIO > > Feedback, suggestions, etc are greatly appreciated. Here are the CI build reports for 5.12, 5.14 and 5.16 using Travis. https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true I could not get 5.10 to work on Travis. Though I activated the (--network) option, it still didn't run one of the tests that needs network access. Also, I was initially confused by the fact that, although it has a dist.ini, the tests still have to run through Build.PL. Running **dzil test** does not work. Hope this helps.
thanks, -siddhartha > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 6 14:46:49 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 19:46:49 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A109@CHIMBX5.ad.uillinois.edu> We've been a little better at keeping track of significant changes this time 'round. There aren't a lot of major updates, but it's important to make sure we get a release out to ensure everyone (not just those familiar with git) can access them. chris On Feb 6, 2013, at 12:35 PM, wrote: > Nice, super work. > > Will there be a rough list of feature changes/addition/deprecation, or > shall I consult git logs? > > On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote: >> >> All, >> >> I am scheduling the next BioPerl CPAN release tentatively for March 1. >> Any help in triaging bug reports would be greatly appreciated! >> >> Amongst all other changes, as mentioned in a separate thread we will >> remove Bio::FeatureIO, now developed in a separate repository: >> >> https://github.com/bioperl/Bio-FeatureIO >> >> Feedback, suggestions, etc are greatly appreciated. >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Biop... 
at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 6 14:54:58 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 19:54:58 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> On Feb 6, 2013, at 1:36 PM, Siddhartha Basu wrote: > Hi, > > On Tue, 05 Feb 2013, Fields, Christopher J wrote: > >> All, >> >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! >> >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: >> >> https://github.com/bioperl/Bio-FeatureIO >> >> Feedback, suggestions, etc are greatly appreciated. > > Here are CI build report on 5.12, 5.14 and 5.16 using travis. > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true > > Could not get 5.10 to work on travis. Though i activated the (--network) > option, it still didn't run one of the test that needs network. Also, initially got > confused by the fact that though it has dist.ini, the tests still has > to run through Build.PL. Running **dzil test** do not work. > > Hope this helps. > > thanks, > -siddhartha Just to point out, that was for Bio-FeatureIO. Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release). 
Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken). I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed. Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation. chris From sidd.basu at gmail.com Wed Feb 6 15:26:06 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Wed, 6 Feb 2013 14:26:06 -0600 Subject: [Bioperl-l] Re: Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> Message-ID: <5112bc60.c69e320a.1e98.2028@mx.google.com> On Wed, 06 Feb 2013, Fields, Christopher J wrote: > On Feb 6, 2013, at 1:36 PM, Siddhartha Basu > wrote: > > > Hi, > > > > On Tue, 05 Feb 2013, Fields, Christopher J wrote: > > > >> All, > >> > >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! > >> > >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: > >> > >> https://github.com/bioperl/Bio-FeatureIO > >> > >> Feedback, suggestions, etc are greatly appreciated. > > > > Here are CI build report on 5.12, 5.14 and 5.16 using travis. > > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true > > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true > > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true > > > > Could not get 5.10 to work on travis. Though i activated the (--network) > > option, it still didn't run one of the test that needs network. 
Also, initially got > > confused by the fact that though it has dist.ini, the tests still has > > to run through Build.PL. Running **dzil test** do not work. > > > > Hope this helps. > > > > thanks, > > -siddhartha > > Just to point out, that was for Bio-FeatureIO. Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release). So, what steps are left for getting the release out to CPAN? For example, are there a lot of feature branches still to be merged, or a lot of unit tests still not passing? I'm just trying to figure out whether I could be of any help in expediting the release process. However, if these are already taken care of, please ignore this. > > Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken). I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed. As for the error I encountered, the presence of Build.PL was blocking the dzil build/release process, and by default dzil expects to generate Build.PL itself during that process. However, I am not sure which mode is most suitable for BioPerl devs. > Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation. thanks, -siddhartha > > chris From hlapp at drycafe.net Wed Feb 6 16:30:33 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 6 Feb 2013 16:30:33 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: Great points, George, and you're making a very compelling argument. I'm in total agreement.
It's almost becoming a reason to having to be embarrassed to still be programming in Perl these days, so one might as well have fun while it lasts. -hilmar On Feb 6, 2013, at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: >> [...] >> Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point >> out that Python users are in the same boat: the Python version for >> CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 >> (and recommends python 2.7). >> >> We can always state that perl 5.8 is supported for the upcoming >> Bioperl release, but we're dropping v5.8 support for any future >> releases. > > Do more than drop support for 5.8. > > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. 
> > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. > > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Wed Feb 6 17:11:06 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:11:06 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> George, Should put your post on a pedestal :) tl;dr version: I completely agree, but we need help in order to do this. Long(-winded) version: I agree completely, backwards compatibility is killing us. But, we do need current and new people to get involved and help drive this forward. We need people on all fronts, from coding and bug fixes to documentation and web site maintenance. I've been driving this bus for a number of years now. Not getting tired yet, but I am getting substantially busier with my current endeavors, so my time spent working on BioPerl has dwindled considerably. Any additional support or sharing of responsibilities will help tremendously in keeping up momentum (if someone else wants to take the wheel for a bit, please let me know :). If we follow the perl release route, we should streamline the release process (think Dist::Zilla), end support of older versions of Perl, and work on a sustainable release schedule. The fact that we have so many of us so-called 'old folks' speaking up in favor of this is a very good sign. We do need a bit more than that; we need help. 
BioPerl is a very large project. A key point we need to address, which is very important for the future of BioPerl. I use Perl quite a bit in my current work (dabble with Ruby and Python as well when I have to). BioPerl? A little, but not as much as I could. Shocked? The main three reason I don't use it 'in anger': performance, performance, and performance. It is very important that we make a concerted effort to address this at all levels. It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them). A specific example: Heng Li once tested the performance of FASTQ parsing (perl, python, bioperl, biopython, his C code, etc). BioPerl's FASTQ couldn't even be measured; IIRC it went on for many hours until he killed it. This was with the older version of the parser, but I'm willing to bet the newer one I wrote isn't any better. This. needs. to. change. I see no problem in stating any generic parsing and low-level interfaces are just as much a part of what BioPerl encompasses as the higher-level Bio::* classes themselves. Steve and Jason were on to something with SearchIO; it's maybe not as performant as we would like, but it certainly is more flexible in terms of what can be done, b/c it separates out low-level parsing from object creation. That's the general model we should look at. There is a good reason Biopython is following this model with their SearchIO implementation (Peter C, are you reading this?) We have a lot of very talented people involved with this project, both on the purely computational and purely biological end as well as the folks like me who straddle the two domains. A lot of good code out there that can be used, wrapped, taken advantage of, including everything we currently have in BioPerl. Let's come up with something that both works and works well, that people can use on a regular basis, even at a low level if they choose. 
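[Editor's illustration] The separation of low-level parsing from object creation described above might look roughly like the following. This is only a sketch under stated assumptions: `fastq_iterator` is a hypothetical name, not an existing BioPerl function, and it assumes simple four-line FASTQ records (no wrapped sequence or quality lines). The low-level layer hands back plain hashrefs; any Bio::* object would be built from those by a separate, optional layer.

```perl
use strict;
use warnings;

# Hypothetical low-level layer (not part of BioPerl): returns a closure
# that yields one plain hashref per four-line FASTQ record, or undef at
# end of input. No objects are created here.
sub fastq_iterator {
    my ($fh) = @_;
    return sub {
        my $head = <$fh>;
        return unless defined $head;          # end of input
        my $seq  = <$fh>;
        my $plus = <$fh>;
        my $qual = <$fh>;
        die "truncated FASTQ record\n" unless defined $qual;
        chomp($head, $seq, $plus, $qual);
        die "bad header line: $head\n"    unless $head =~ s/^\@//;
        die "bad separator line: $plus\n" unless $plus =~ /^\+/;
        # Split "id description" on the first run of whitespace.
        my ($id, $desc) = split ' ', $head, 2;
        return { id => $id, desc => $desc // '', seq => $seq, qual => $qual };
    };
}

# Usage: read two records from an in-memory filehandle.
my $data = "\@read1 first\nACGT\n+\nIIII\n\@read2\nGGCC\n+\n!!!!\n";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
my $next = fastq_iterator($fh);
while (my $rec = $next->()) {
    printf "%s\t%d\n", $rec->{id}, length $rec->{seq};
}
```

Callers who only need IDs and sequence lengths never pay for object construction; a heavier object layer could wrap the same iterator and build sequence objects only on request.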
That alone would dissuade new users from writing up (yet another) custom FASTA/FASTQ/BLAST/GenBank/etc parser b/c the BioPerl one takes millennia to finish. A few examples on this front: Rob Buels created a generic parser for GFF3 (Bio::GFF3::LowLevel) with very few dependencies, we wrap this with the newer Bio::FeatureIO code. Leon has Bio::SFF. Lincoln of course wrote Bio::DB::Sam and Bio::DB::BigFile. I have started a wrapper around Heng's FASTQ/FASTA parsing code (kseq), it seems to work quite well (~20M FASTQ in 30 sec last I recall?). So: If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that. If it means creating a new Bio-NGS repo to focus some of these efforts, so be it. If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it. If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes). If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later. But I can't do it alone. We (not just me, but we) need to drive the direction we take. First one who codes gets the gold ring. chris On Feb 6, 2013, at 12:47 PM, Aaron Mackey wrote: > Huzzah! > > -- > Aaron J. Mackey, PhD > Assistant Professor > Center for Public Health Genomics > University of Virginia > amackey at virginia.edu > http://www.cphg.virginia.edu/mackey > > > On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: > > [...] > > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point > > out that Python users are in the same boat: the Python version for > > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > > (and recommends python 2.7). > > > > We can always state that perl 5.8 is supported for the upcoming > > Bioperl release, but we're dropping v5.8 support for any future > > releases. > > Do more than drop support for 5.8. 
> > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. > > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. > > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. 
They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Wed Feb 6 17:34:42 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:34:42 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1AF0C@CHIMBX5.ad.uillinois.edu> I want to clarify, parser optimization isn't the only point we need to focus on by any means (and may not be the main one). 
There is a lot of room for improvement top to bottom; that was one specific example I have long held to be an issue. -c On Feb 6, 2013, at 4:11 PM, "Fields, Christopher J" wrote: > Shocked? The main three reason I don't use it 'in anger': performance, performance, and performance. It is very important that we make a concerted effort to address this at all levels. It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them). ... From p.j.a.cock at googlemail.com Wed Feb 6 17:43:13 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 6 Feb 2013 22:43:13 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J wrote: > > I see no problem in stating any generic parsing and low-level interfaces > are just as much a part of what BioPerl encompasses as the higher-level > Bio::* classes themselves. Steve and Jason were on to something with > SearchIO; it's maybe not as performant as we would like, but it certainly > is more flexible in terms of what can be done, b/c it separates out > low-level parsing from object creation. That's the general model we > should look at. There is a good reason Biopython is following this > model with their SearchIO implementation (Peter C, are you reading this?) Actually I don't think we did end up with that kind of separation in the Biopython SearchIO - which is not to say it isn't an excellent model to follow.
Rather, the Biopython SearchIO (like the BioPerl one) had as its first goal a consistent object model across assorted file formats. The idea of low-level, minimal-overhead parsers (which are very format specific), on which a heavier but consistent object model can be built, might be a good balance - the high-level API has the convenience, but if you give that up you can have more speed. That's what I recommend with FASTQ and Biopython, e.g. http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > > I have started a wrapper around Heng's FASTQ/FASTA parsing > code (kseq), it seems to work quite well (~20M FASTQ in 30 sec > last I recall?). > I'd have to dig through my emails, but I think the BioRuby guys looked at that too - as I recall, while it was fast, the error handling left something to be desired. Email me directly or on the BioRuby list if you want to follow up on that. Regards, Peter From cjfields at illinois.edu Wed Feb 6 17:53:21 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:53:21 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > wrote: >> >> I see no problem in stating any generic parsing and low-level interfaces >> are just as much a part of what BioPerl encompasses as the higher-level >> Bio::* classes themselves.
Steve and Jason were on to something with >> SearchIO; it's maybe not as performant as we would like, but it certainly >> is more flexible in terms of what can be done, b/c it separates out >> low-level parsing from object creation. That's the general model we >> should look at. There is a good reason Biopython is following this >> model with their SearchIO implementation (Peter C, are you reading this?) > > Actually I don't think we did end up with that kind of separation in the > Biopython SearchIO - which is not so say it isn't an excellent model > to follow. Rather the Biopython SearchIO (like the BioPerl one) had > as the first goal a consistent object model across assorted file > formats. > > The idea of a low level minimal overhead parsers (which are very > format specific), on which a heavier but consistent object model > can be built might be a good balance - the high level API has the > connivence, but if you give that up you can have more speed. > That's what I recommend with FASTQ and Biopython, e.g. > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > >> >> I have started a wrapper around Heng's FASTQ/FASTA parsing >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec >> last I recall?). >> > > I'd have to dig through my emails, but I think the BioRuby guys > looked at that too - as I recall while it was fast, the error handling > left something to be desired. Email me directly or on the BioRuby > list if you want to follow up on that. > > Regards, > > Peter I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out. IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into. Maybe worth moving to open-bio-l for broader discussion. 
chris

From whereverroadgoes at gmail.com Wed Feb 6 16:59:04 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Wed, 6 Feb 2013 13:59:04 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To: <87txpr26jj.fsf@topper.koldfront.dk>
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> <87txpr26jj.fsf@topper.koldfront.dk>
Message-ID: <411e920d-e614-417d-9198-78bef9adba16@googlegroups.com>

Everything's working now! Thank you very much, especially to you Adam!

>

From carandraug+dev at gmail.com Wed Feb 6 20:38:20 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Thu, 7 Feb 2013 01:38:20 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID:

On 5 February 2013 20:56, Fields, Christopher J wrote:
> On Feb 5, 2013, at 2:06 PM, Carnë Draug wrote:
>> how much perl backwards compatibility does bioperl need to keep?
>
> Aim for 5.10.1, but be careful of smart-match.

Well, I solved my problem differently and ended up not needing any of the new features. But next time I'll know.

Thanks
Carnë

From pcantalupo at gmail.com Wed Feb 6 23:04:08 2013
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Wed, 6 Feb 2013 23:04:08 -0500
Subject: [Bioperl-l] bug 3376 status needs updated
Message-ID:

Hi,

A few months ago, I fixed bug 3376 (https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2). The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been updated to resolved or closed. Should I do this or is Chris the only one who does that?
Thank you, Paul From cjfields at illinois.edu Wed Feb 6 23:20:30 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 04:20:30 +0000 Subject: [Bioperl-l] bug 3376 status needs updated In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B45C@CHIMBX5.ad.uillinois.edu> No, go ahead and close it. Let me know if you run into perm. problems with it. chris On Feb 6, 2013, at 10:04 PM, Paul Cantalupo wrote: > Hi, > > A few months ago, I fixed bug 3376 ( > https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2). > The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been > updated to resolved or closed. Should I do this or is Chris the only one > who does that? > > Thank you, > > Paul > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From l.m.timmermans at students.uu.nl Thu Feb 7 04:07:57 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Thu, 7 Feb 2013 10:07:57 +0100 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <5112bc60.c69e320a.1e98.2028@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> <5112bc60.c69e320a.1e98.2028@mx.google.com> Message-ID: On Wed, Feb 6, 2013 at 9:26 PM, Siddhartha Basu wrote: > As far as the error i encountered, presence of Build.PL was blocking dzil > build/release process. And by default, dzil expects to generate > Build.PL during its build/release process. However, i am not sure which > mode is the most suitable for bioperl devs. You can prune the Build.PL, and then let dzil add its own. We wouldn't be the first to do that sort of thing. 
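For readers following along, the pruning step mentioned here could look roughly like the dist.ini fragment below. This is only a sketch: it assumes the stock Dist::Zilla plugins GatherDir, PruneFiles and ModuleBuild, and it leaves out everything else a real dist.ini needs (name, version, author, license and so on). It is not taken from the BioPerl repository.

```ini
; dist.ini fragment (illustrative sketch, not BioPerl's actual file)
; Gather the files in the repository ...
[GatherDir]

; ... drop the hand-written Build.PL so it no longer blocks the build ...
[PruneFiles]
filename = Build.PL

; ... and let Dist::Zilla generate its own Build.PL instead.
[ModuleBuild]
```

An alternative with the same effect would be to pass `exclude_filename = Build.PL` to GatherDir rather than pruning afterwards.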
Leon From amackey at virginia.edu Thu Feb 7 10:25:07 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Thu, 7 Feb 2013 10:25:07 -0500 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> Message-ID: You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. -Aaron -- Aaron J. Mackey, PhD Assistant Professor Center for Public Health Genomics University of Virginia amackey at virginia.edu http://www.cphg.virginia.edu/mackey On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J wrote: > On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > > > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > > wrote: > >> > >> I see no problem in stating any generic parsing and low-level interfaces > >> are just as much a part of what BioPerl encompasses as the higher-level > >> Bio::* classes themselves. Steve and Jason were on to something with > >> SearchIO; it's maybe not as performant as we would like, but it > certainly > >> is more flexible in terms of what can be done, b/c it separates out > >> low-level parsing from object creation. That's the general model we > >> should look at. There is a good reason Biopython is following this > >> model with their SearchIO implementation (Peter C, are you reading > this?) 
> >
> > Actually I don't think we did end up with that kind of separation in the
> > Biopython SearchIO - which is not to say it isn't an excellent model
> > to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> > as the first goal a consistent object model across assorted file
> > formats.
> >
> > The idea of low-level, minimal-overhead parsers (which are very
> > format specific), on which a heavier but consistent object model
> > can be built might be a good balance - the high-level API has the
> > convenience, but if you give that up you can have more speed.
> > That's what I recommend with FASTQ and Biopython, e.g.
> > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> >
> >>
> >> I have started a wrapper around Heng's FASTQ/FASTA parsing
> >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> >> last I recall?).
> >>
> >
> > I'd have to dig through my emails, but I think the BioRuby guys
> > looked at that too - as I recall while it was fast, the error handling
> > left something to be desired. Email me directly or on the BioRuby
> > list if you want to follow up on that.
> >
> > Regards,
> >
> > Peter
>
> I did a little on this, worth following up on, but I pulled the FASTQ test
> examples you created from the paper to test it out. IIRC it parsed where
> it needed to, but I'm not sure how it handled bad sequences, so yes, worth
> looking into. Maybe worth moving to open-bio-l for broader discussion.
>
> chris
>

From tiago.hori at gmail.com Thu Feb 7 09:58:37 2013
From: tiago.hori at gmail.com (Tiago Hori)
Date: Thu, 7 Feb 2013 06:58:37 -0800 (PST)
Subject: [Bioperl-l] Search I::O
In-Reply-To: <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
References: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com> <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com>
Message-ID:

Thanks, Jason! It is working now.

So here is what I am trying to accomplish. For a given BLASTX report, I want to extract the best BLASTX hit that is human and whose description does not contain "unnamed" or "PREDICTED". I got very close, but I still can't get it to give me only the top BLAST hit; it gives me all BLAST hits that meet my criteria. I tried using "last" to stop it from looping through the hits once it found a human one, but it didn't work. Can someone help?

Here is my code so far (mostly stolen from the wiki).

use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => 'testsalmon.txt');

while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    if( $hit->description !~ /[Uu]nnamed|PREDICTED|hypothetical/){
      if( $hit->description =~ /Homo sapiens/){
        while( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
            if ( $hsp->percent_identity >= 30) {
              if( $hsp->evalue <= 1e-05){
                print "Query=", $result->query_name,"\t",
                  " Description=", $hit->description,"\t",
                  " Hit=", $hit->name,"\t",
                  " Length=", $hsp->length('total'),"\t",
                  " Percent_id=", $hsp->percent_identity,"\t",
              }
            }
          }
        }
      }
    }
  }
}

T.
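One possible way to stop at the first qualifying hit is sketched below. In Perl a bare `last` only leaves the innermost enclosing loop, so if it sat inside the HSP loop the hit loop would simply carry on with the next hit, which may be what happened here. The sketch sidesteps that by moving the filters into a subroutine and returning as soon as one hit/HSP pair passes. The subroutine name `first_human_hit` is made up for illustration; the filters and thresholds are copied from the code above, and the sketch assumes hits come back in report order (best first). The file name is taken from the command line.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the first hit/HSP
# pair of one result that passes all filters, or an empty list.
sub first_human_hit {
    my ($result) = @_;
    while ( my $hit = $result->next_hit ) {
        my $desc = $hit->description;
        next if $desc =~ /[Uu]nnamed|PREDICTED|hypothetical/;
        next unless $desc =~ /Homo sapiens/;
        while ( my $hsp = $hit->next_hsp ) {
            next unless $hsp->length('total') > 50;
            next unless $hsp->percent_identity >= 30;
            next unless $hsp->evalue <= 1e-05;
            return ( $hit, $hsp );    # leaves both loops at once
        }
    }
    return;
}

# Only runs when a BLAST report is given on the command line.
if (@ARGV) {
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new( -format => 'blast', -file => $ARGV[0] );
    while ( my $result = $in->next_result ) {
        # Skip queries with no qualifying human hit.
        my ( $hit, $hsp ) = first_human_hit($result) or next;
        print join( "\t",
            "Query=" . $result->query_name,
            "Description=" . $hit->description,
            "Hit=" . $hit->name,
            "Length=" . $hsp->length('total'),
            "Percent_id=" . $hsp->percent_identity ), "\n";
    }
}
```

Keeping the nested loops and using loop labels (`RESULT: while (...) { ... next RESULT; }`) would be an equivalent alternative; the subroutine form is just easier to test in isolation.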
On Wednesday, February 6, 2013 6:46:47 PM UTC-3:30, Jason Stajich wrote: > > you are missing a comma after the -format => 'blast' > should be > my $in = Bio::SearchIO->new(-format => 'blast', > -file => 'XXX' ); > > > On Feb 5, 2013, at 7:21 AM, Tiago Hori > > wrote: > > > Hi All, > > > > I am trying to find the best putative orthologs for 44K Atlantic Salmon > > sequences, and so I need to parse 44K BLAST reports to find the best > human > > hit. I am trying to learn SearchIO, but when I try the first example on > > the HOWTO: use strict; > > use Bio::SearchIO; > > > > my $in = new Bio::SearchIO(-format => 'blast' > > -file => 'C001R047.txt'); > > > > while( my $result = $in->next_result ) { > > ## $result is a Bio::Search::Result::ResultI compliant object > > while( my $hit = $result->next_hit ) { > > ## $hit is a Bio::Search::Hit::HitI compliant object > > while( my $hsp = $hit->next_hsp ) { > > ## $hsp is a Bio::Search::HSP::HSPI compliant object > > if( $hsp->length('total') > 50 ) { > > if ( $hsp->percent_identity >= 75 ) { > > print "Query=", $result->query_name, > > " Hit=", $hit->name, > > " Length=", $hsp->length('total'), > > " Percent_id=", $hsp->percent_identity, "\n"; > > } > > } > > } > > } > > } > > > > I get this error: Odd number of elements in hash assignment at > > /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. > > > > I am using BioPerl version 1.6.901. Is there a format problem with the > > blast reports? > > > > Any help would be greatly appreciated! > > > > T. > > _______________________________________________ > > Bioperl-l mailing list > > Biop... at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Jason Stajich > jason.... at gmail.com > ja...
at bioperl.org > > From cjfields at illinois.edu Thu Feb 7 10:56:04 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 15:56:04 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> This will likely be the approach for more NGS-friendly Bio::Seq class. Calculation of the PHRED scores could also be deferred until needed. seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it. chris On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. > > -Aaron > > -- > Aaron J. Mackey, PhD > Assistant Professor > Center for Public Health Genomics > University of Virginia > amackey at virginia.edu > http://www.cphg.virginia.edu/mackey > > > On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J wrote: > On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > > > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > > wrote: > >> > >> I see no problem in stating any generic parsing and low-level interfaces > >> are just as much a part of what BioPerl encompasses as the higher-level > >> Bio::* classes themselves. 
Steve and Jason were on to something with
> >> SearchIO; it's maybe not as performant as we would like, but it certainly
> >> is more flexible in terms of what can be done, b/c it separates out
> >> low-level parsing from object creation. That's the general model we
> >> should look at. There is a good reason Biopython is following this
> >> model with their SearchIO implementation (Peter C, are you reading this?)
> >
> > Actually I don't think we did end up with that kind of separation in the
> > Biopython SearchIO - which is not to say it isn't an excellent model
> > to follow. Rather the Biopython SearchIO (like the BioPerl one) had
> > as the first goal a consistent object model across assorted file
> > formats.
> >
> > The idea of low-level, minimal-overhead parsers (which are very
> > format specific), on which a heavier but consistent object model
> > can be built might be a good balance - the high-level API has the
> > convenience, but if you give that up you can have more speed.
> > That's what I recommend with FASTQ and Biopython, e.g.
> > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/
> >
> >>
> >> I have started a wrapper around Heng's FASTQ/FASTA parsing
> >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec
> >> last I recall?).
> >>
> >
> > I'd have to dig through my emails, but I think the BioRuby guys
> > looked at that too - as I recall while it was fast, the error handling
> > left something to be desired. Email me directly or on the BioRuby
> > list if you want to follow up on that.
> >
> > Regards,
> >
> > Peter
>
> I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out. IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into. Maybe worth moving to open-bio-l for broader discussion.
> > chris > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From amackey at virginia.edu Thu Feb 7 11:09:14 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Thu, 7 Feb 2013 11:09:14 -0500 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries: while (my $seq = $seqio->nextseq) { $ct++ }; in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries. This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material. Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using. What's the state of thread-safe Perl these days? 
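The pull-based idea described above can be sketched in a few lines of plain Perl. This is an illustration only, not BioPerl code: the package names `LazyFastq::Reader` and `LazyFastq::Record` are invented, and the sketch assumes strict four-line FASTQ records (no wrapped sequence lines, no trailing blank lines) with Sanger-style quality encoding (ASCII offset 33). The reader only groups four lines into a raw chunk; splitting, validation and PHRED conversion happen when an accessor is first called. Note that the real Bio::SeqIO iterator is spelled `next_seq`, whereas the sketch uses its own `next_record`.

```perl
use strict;
use warnings;

package LazyFastq::Record;    # hypothetical name, not a BioPerl class

sub new { my ( $class, $raw ) = @_; return bless { raw => $raw }, $class }

# Split and validate the raw chunk on first access only.
sub _fields {
    my ($self) = @_;
    return $self->{fields} if $self->{fields};
    my ( $head, $seq, $plus, $qual ) = split /\n/, $self->{raw}, -1;
    die "Malformed FASTQ record\n"
        unless defined $qual
        && substr( $head, 0, 1 ) eq '@'
        && substr( $plus, 0, 1 ) eq '+'
        && length($seq) == length($qual);
    return $self->{fields} =
        { id => substr( $head, 1 ), seq => $seq, qual => $qual };
}

sub id   { return $_[0]->_fields->{id} }
sub seq  { return $_[0]->_fields->{seq} }
sub qual { return $_[0]->_fields->{qual} }

# PHRED scores are computed (and cached) only if somebody asks for them.
sub phred {
    my ($self) = @_;
    $self->{phred} ||= [ map { ord($_) - 33 } split //, $self->qual ];
    return $self->{phred};
}

package LazyFastq::Reader;    # hypothetical name, not a BioPerl class

sub new { my ( $class, $fh ) = @_; return bless { fh => $fh }, $class }

# Chunk the stream into unparsed four-line blobs; nothing else.
sub next_record {
    my ($self) = @_;
    my $fh  = $self->{fh};
    my $raw = '';
    for ( 1 .. 4 ) {
        my $line = <$fh>;
        if ( !defined $line ) {
            die "Truncated FASTQ record\n" if length $raw;
            return;    # clean end of input
        }
        $raw .= $line;
    }
    return LazyFastq::Record->new($raw);
}

package main;

# Tiny in-memory example: counting records never touches the fields.
my $fastq = "\@r1\nACGT\n+\nIIII\n\@r2\nGGCC\n+\n!!!!\n";
open my $fh, '<', \$fastq or die "Cannot open in-memory file: $!";
my $reader = LazyFastq::Reader->new($fh);
my ( $count, $last ) = (0);
while ( my $rec = $reader->next_record ) { $count++; $last = $rec }
print "records: $count\n";
print join( ' ', $last->id, $last->seq, "@{ $last->phred }" ), "\n";
```

The trade-off is the one raised earlier in the thread about error handling: a malformed record is only reported when one of its fields is first accessed, not when it is read.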
-Aaron On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < cjfields at illinois.edu> wrote: > This will likely be the approach for more NGS-friendly Bio::Seq class. > Calculation of the PHRED scores could also be deferred until needed. > > seqtk has some C-based methods that we could possibly take advantage of, > but will have to look into it. > > chris > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > You might also want to consider a lazy/pull-based parser to defer > parsing/object-building for pieces of the object that don't get used. This > also usually provides some error tolerance. > > > > -Aaron > From sidd.basu at gmail.com Thu Feb 7 11:38:47 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Thu, 7 Feb 2013 10:38:47 -0600 Subject: [Bioperl-l] Re: FASTQ, was Re:BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: <5113d899.ea64320a.489a.262d@mx.google.com> Another approach might be use map-reduce(Hadoop) if possible. I have seen one implementation in biopython's GFF3 parser. http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/ -siddhartha On Thu, 07 Feb 2013, Aaron Mackey wrote: > e.g., a pull-based FASTQ parser that did nothing else at the top level but > "chunk" the file into as-yet-unparsed four-line blobs could appear to work > very fast, if the user code did nothing but count the number of entries: > > while (my $seq = $seqio->nextseq) { $ct++ }; > > in other words, you defer *everything* except the minimal amount of > parsing/logic required to detect object boundaries. 
> > This is, in fact, the exact opposite of the event-based SearchIO "push" > parsers, which always perform the most parsing possible, despite the user > never accessing most of the material. > > Lastly, with respect to performance, if the parsing/object building > operation is not simply IO bound, then parallel parser/object-building CPU > threads could be considered, which could then dynamically adapt to > pre-parse attributes (e.g. quality scores) that the calling code was > actually using. What's the state of thread-safe Perl these days? > > -Aaron > > > On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < > cjfields at illinois.edu> wrote: > > > This will likely be the approach for more NGS-friendly Bio::Seq class. > > Calculation of the PHRED scores could also be deferred until needed. > > > > seqtk has some C-based methods that we could possibly take advantage of, > > but will have to look into it. > > > > chris > > > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > > > You might also want to consider a lazy/pull-based parser to defer > > parsing/object-building for pieces of the object that don't get used. This > > also usually provides some error tolerance. 
> > > > > > -Aaron > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Feb 7 11:55:53 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 16:55:53 +0000 Subject: [Bioperl-l] FASTQ, was Re:BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <5113d899.ea64320a.489a.262d@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> <5113d899.ea64320a.489a.262d@mx.google.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7B8@CHIMBX5.ad.uillinois.edu> I think we will want to allow for a multitude of implementations. SeqIO already allows for that to a degree, but multiple backend implementations (say, different ways of parsing/processing FASTQ and others) isn't supported yet. chris On Feb 7, 2013, at 10:38 AM, Siddhartha Basu wrote: > Another approach might be use map-reduce(Hadoop) if possible. I have > seen one implementation in biopython's GFF3 parser. > http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/ > > -siddhartha > > > On Thu, 07 Feb 2013, Aaron Mackey wrote: > >> e.g., a pull-based FASTQ parser that did nothing else at the top level but >> "chunk" the file into as-yet-unparsed four-line blobs could appear to work >> very fast, if the user code did nothing but count the number of entries: >> >> while (my $seq = $seqio->nextseq) { $ct++ }; >> >> in other words, you defer *everything* except the minimal amount of >> parsing/logic required to detect object boundaries. 
>> >> This is, in fact, the exact opposite of the event-based SearchIO "push" >> parsers, which always perform the most parsing possible, despite the user >> never accessing most of the material. >> >> Lastly, with respect to performance, if the parsing/object building >> operation is not simply IO bound, then parallel parser/object-building CPU >> threads could be considered, which could then dynamically adapt to >> pre-parse attributes (e.g. quality scores) that the calling code was >> actually using. What's the state of thread-safe Perl these days? >> >> -Aaron >> >> >> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < >> cjfields at illinois.edu> wrote: >> >>> This will likely be the approach for more NGS-friendly Bio::Seq class. >>> Calculation of the PHRED scores could also be deferred until needed. >>> >>> seqtk has some C-based methods that we could possibly take advantage of, >>> but will have to look into it. >>> >>> chris >>> >>> On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: >>> >>>> You might also want to consider a lazy/pull-based parser to defer >>> parsing/object-building for pieces of the object that don't get used. This >>> also usually provides some error tolerance. 
>>>> >>>> -Aaron >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Feb 7 12:01:07 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 17:01:07 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7EF@CHIMBX5.ad.uillinois.edu> re: thread-safe perl, so-so at best from what I understand. chris On Feb 7, 2013, at 10:09 AM, Aaron Mackey wrote: > e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries: > > while (my $seq = $seqio->nextseq) { $ct++ }; > > in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries. > > This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material. 
> > Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using. What's the state of thread-safe Perl these days? > > -Aaron > > > On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J wrote: > This will likely be the approach for more NGS-friendly Bio::Seq class. Calculation of the PHRED scores could also be deferred until needed. > > seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it. > > chris > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. > > > > -Aaron From hartzell at alerce.com Thu Feb 7 16:36:24 2013 From: hartzell at alerce.com (George Hartzell) Date: Thu, 7 Feb 2013 13:36:24 -0800 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <20756.7768.125680.662488@gargle.gargle.HOWL> Fields, Christopher J writes: > George, > > Should put your post on a pedestal :) > > tl;dr version: I completely agree, but we need help in order to do this. > [...] And therein lies the [a] problem. Don't look at me.... I'm not coding on bioinformatics problems these days (though I'm available...) so _maybe_ I shouldn't have gotten up on the soapbox. 
But I'm so sick of getting into arguments (or walking away from them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, you can't write good code in Perl, look - Ruby has GEMS!, etc...

Perl of the olden days was an easy language in which to write really shitty code. Even the Perl of the BioPerl heyday wasn't really much help; roll your own OO, roll your own distro-building, mountains of monkey-work to provide consistent POD, versioning, etc...

But that's not the Perl that I use. I have Moose and Moo. TAP and the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. MetaCPAN. Pinto. GitHub. Perlbrew. Wow.

It isn't any harder to write good code, for measures that I care about, using Perl than it is in *any* of the other similar languages.

And it's just as easy, and happens just as frequently, for people to write shitty (undocumented, untested, poorly managed, poorly packaged, ...) stuff in the other languages.

GET OFF MY LAWN, KID! (Yeah, I know...)

But BioPerl *is* dying. You might be standing on the shoulders of giants when you use it to solve a problem, but you *definitely* have those same giants (and their extended families) on your shoulders every time I see you try to move the project forward. All of that history has become the tail that's wagging the dog.

If all y'all are going to keep the thing alive, moving forward and contributing to new great works then make Apple your hero. Deprecate the stuff that's holding you back, give folks a path forward and move on.

Have fun. Use sharp tools. Do cool science. Build cool things. Advance your careers (forgot that one last time). Be reasonable and professional.

Supporting last year's projects is someone else's business opportunity.

g.

ps. Are all y'all following this thread?

http://news.ycombinator.com/item?id=5123022

Maybe someone should search down for this bit: "Where to start? Any list of this [sic] projects?" and insert a plug for the various open-bio projects. (But "someone" doesn't work here, he said...).

From cjfields at illinois.edu Thu Feb 7 18:12:19 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 23:12:19 +0000
Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version
In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1D071@CHIMBX5.ad.uillinois.edu>

On Feb 7, 2013, at 3:36 PM, George Hartzell wrote:

> Fields, Christopher J writes:
>> George,
>>
>> Should put your post on a pedestal :)
>>
>> tl;dr version: I completely agree, but we need help in order to do this.
>> [...]
>
> And therein lies the [a] problem. Don't look at me....
>
> I'm not coding on bioinformatics problems these days (though I'm
> available...) so _maybe_ I shouldn't have gotten up on the soapbox.
>
> But I'm so sick of getting into arguments (or walking away from
> them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
> you can't write good code in Perl, look - Ruby has GEMS!, etc...

Right, but that's a perception not just in the Bio* world. It's larger and more pervasive than that.

> Perl of the olden days was an easy language in which to write really
> shitty code. Even the Perl of the BioPerl heyday wasn't really much
> help; roll your own OO, roll your own distro-building, mountains of
> monkey-work to provide consistent POD, versioning, etc...
>
> But that's not the Perl that I use. I have Moose and Moo. TAP and
> the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm.
> MetaCPAN. Pinto. GitHub. Perlbrew. Wow.

Yes, and that is the direction we need to go in.
> It isn't any harder to write good code, for measures that I care
> about, using Perl than it is in *any* of the other similar languages.
>
> And it's just as easy, and happens just as frequently, for people to
> write shitty (undocumented, untested, poorly managed, poorly packaged,
> ...) stuff in the other languages.

Oh, I know. I'm working on some very nice looking but terribly implemented Python code now.

> GET OFF MY LAWN, KID! (Yeah, I know...)
>
> But BioPerl *is* dying. You might be standing on the shoulders of
> giants when you use it to solve a problem, but you *definitely* have
> those same giants (and their extended families) on your shoulders
> every time I see you try to move the project forward. All of that
> history has become the tail that's wagging the dog.

Yep.

> If all y'all are going to keep the thing alive, moving forward and
> contributing to new great works then make Apple your hero. Deprecate
> the stuff that's holding you back, give folks a path forward and move
> on.

That's fine.

> Have fun. Use sharp tools. Do cool science. Build cool things.
> Advance your careers (forgot that one last time). Be reasonable and
> professional.
>
> Supporting last year's projects is someone else's business
> opportunity.
>
> g.

Right, but this isn't just my show. I can't do this alone; it's simply too much code and I don't have even 1/4 the time I used to have.

> ps. Are all y'all following this thread?
>
> http://news.ycombinator.com/item?id=5123022
>
> Maybe someone should search down for this bit: "Where to start? Any
> list of this [sic] projects?" and insert a plug for the various
> open-bio projects. (But "someone" doesn't work here, he said...).

Read the original guy's post. He's completely delusional (okay, maybe not *completely*, but he comes across as quite bitter and unrealistic). Frankly I don't feel so bad if he wants to leave. He doesn't like messy things. Biology is messy; if one doesn't understand that, then computational biology is not for them.
chris From carandraug+dev at gmail.com Thu Feb 7 23:12:22 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Fri, 8 Feb 2013 04:12:22 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version Message-ID: On 6 February 2013 22:11, "Fields, Christopher J" wrote: > [...] > So: > > If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that. > > If it means creating a new Bio-NGS repo to focus some of these efforts, so be it. > > If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it. > > If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes). > > If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later. > > But I can't do it alone. We (not just me, but we) need to drive the direction we take. > > First one who codes gets the gold ring. Hi I know I'm not much involved with bioperl development but here's my suggestion as maintainer of another quite modular free software project. I swear I'm not promoting it. Skip to the last paragraph for the very short version. Octave Forge is now a collection of packages for GNU Octave, each released independently whenever its maintainer sees fit. But it wasn't like that before. For a long time, everything was released at the same time, there was no independent packages. Then it was decided to split it into sections: main, extra and nonfree (free software dependent on non-free libraries, now purged), and inside those, it was split into packages, each with its own maintainer. But some packages were (and are) more active that the others. Some packages even came from single contributions and we never heard from the authors again. And so, with time, cruft settled in. We didn't want to remove the code, but no one was interested or comfortable enough on the field, to fix it either. 
Packages that had a much more active development were being dragged down by code that no one was maintaining. So we broke with that and each package is now released independently. We have packages that haven't been released in 3 years, yes, but that just shows which packages no one cares about. Those have been marked as unmaintained and anyone can come around and make a release if they care about it. As the maintainer of the project, I do *not* make the releases of the packages. The package maintainers prepare everything and upload it; I only run a handful of tests (takes me 10min), upload it to our server, and make the official announcement. I am also the maintainer of one of the packages, and have often made releases of unmaintained packages because I needed them. That's to show, if they are important enough for someone, they will get a release somehow. If they are not important, why would we waste our time on them anyway? We now have around 5 package releases per month, many of them being minor releases with a handful of bug fixes. Preparing a release of a small package is much easier and much less trouble than preparing a giant release encompassing all of them at the same time. Short version: I'd recommend splitting the project into much smaller ones. Some of the small ones will wither and die but those are the less important ones, and that will allow the others, the ones that people care about, freedom to grow faster. Bioperl would still be just one project, that incorporates a hundred or so smaller modules. Let those who care the most about a specific module take care of it and make the releases. Releasing a module becomes much simpler, which means more releases, more activity, and the smaller code base for each module also makes it less intimidating for new contributors. Carnë From hartzell at alerce.com Fri Feb 8 01:17:17 2013 From: hartzell at alerce.com (George Hartzell) Date: Thu, 7 Feb 2013 22:17:17 -0800 Subject: [Bioperl-l] injecting a bit of levity....
Message-ID: <20756.39021.553502.116384@gargle.gargle.HOWL> Perl's not dead. It's FAMOUS! http://imgs.xkcd.com/comics/perl_problems.png g. From carandraug+dev at gmail.com Fri Feb 8 01:57:30 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Fri, 8 Feb 2013 06:57:30 +0000 Subject: [Bioperl-l] getting a Bio::Search::HSP::HSPI from Bio::SimpleAlign (to find differences between sequences) Message-ID: Hi I already have a Bio::SimpleAlign object (got it after using TCoffee through the bioperl-run module) and I'm trying to get a Bio::Search::HSP::HSPI object from a pair of the aligned sequences. How can I do this? I want to use the seq_inds method to compare the sequences. Here's my actual problem just in case I should be trying to fix it some other way. I have a bunch of sequences from protein isoforms. They have small differences between them, point-mutations, small insertions or deletions, nothing too big. I want to make a table of the mutations that each of them has against the consensus sequence. I already made the alignment and have the consensus from "$align->consensus_string". Now, I want to get something like: isoform1: Ala67Gly, His90_Met91insGln isoform2: .... The seq_inds method from the Bio::Search::HSP::HSPI class seems to do the part of finding the differences, but how can I get one? I can't find it in the documentation. Any tips, and even showing a different approach to my problem, are most appreciated. Thanks, Carnë
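One possible approach to the question above, without going through Bio::Search::HSP::HSPI at all, is to walk the alignment columns directly. The following is only a sketch in plain Perl and rests on several assumptions: the reference (consensus) and each isoform are already available as aligned strings of equal length with '-' for gaps (for instance from $seq->seq for each sequence returned by $align->each_seq); describe_differences() and name3() are hypothetical helper names, not BioPerl methods; and insertions are reported by their flanking reference positions only, which is a simplification of the His90_Met91insGln style.

```perl
use strict;
use warnings;

# One-letter to three-letter amino acid codes.
my %THREE = (
    A => 'Ala', R => 'Arg', N => 'Asn', D => 'Asp', C => 'Cys',
    Q => 'Gln', E => 'Glu', G => 'Gly', H => 'His', I => 'Ile',
    L => 'Leu', K => 'Lys', M => 'Met', F => 'Phe', P => 'Pro',
    S => 'Ser', T => 'Thr', W => 'Trp', Y => 'Tyr', V => 'Val',
);

# Hypothetical helper: three-letter name, or the symbol itself if unknown.
sub name3 {
    my ($aa) = @_;
    return $THREE{ uc($aa) } // $aa;
}

# Hypothetical helper: compare one aligned sequence to the aligned
# reference (same length, '-' for gaps) and list the differences.
# Positions are counted along the ungapped reference.
sub describe_differences {
    my ($reference, $aligned) = @_;
    die "aligned strings differ in length\n"
        unless length($reference) == length($aligned);

    my @diffs;
    my $pos = 0;     # position in the ungapped reference
    my $ins = '';    # residues of the insertion currently being read
    for my $i (0 .. length($reference) - 1) {
        my $r = substr $reference, $i, 1;
        my $s = substr $aligned,   $i, 1;
        if ($r eq '-') {
            # Gap in the reference: the sequence may carry an insertion here.
            $ins .= name3($s) if $s ne '-';
            next;
        }
        # A reference residue closes any open insertion.
        if ($ins ne '') {
            push @diffs, sprintf('%d_%dins%s', $pos, $pos + 1, $ins);
            $ins = '';
        }
        $pos++;
        next if uc($r) eq uc($s);
        push @diffs, $s eq '-'
            ? name3($r) . $pos . 'del'
            : name3($r) . $pos . name3($s);
    }
    push @diffs, sprintf('%d_%dins%s', $pos, $pos + 1, $ins) if $ins ne '';
    return @diffs;
}

# Toy data, made up for illustration only.
my $reference = 'MKAH-MLV';
my %isoform = (
    isoform1 => 'MKGHQMLV',
    isoform2 => 'MKAH-M-V',
);
for my $name (sort keys %isoform) {
    print "$name: ",
        join(', ', describe_differences($reference, $isoform{$name})), "\n";
}
```

With the toy data this prints "isoform1: Ala3Gly, 4_5insGln" and "isoform2: Leu6del". Whether the consensus string carries gaps in the right columns depends on how it was produced, so that part should be checked against the real alignment before trusting the positions.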
From l.m.timmermans at students.uu.nl Fri Feb 8 06:18:58 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Fri, 8 Feb 2013 12:18:58 +0100 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell wrote: > But I'm so sick of getting into arguments (or walking away from > them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, > you can't write good code in Perl, look - Ruby has GEMS!, etc... > > Perl of the olden days was an easy language in which to write really > shitty code. Even the Perl of the BioPerl heyday wasn't really much > help; role your own OO, role your own distro-building, mountains of > monkey-work to provide consistent POD, versioning, etc... > > But that's not the Perl that I use. I have Moose and Moo. TAP and > the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. > MetaCPAN. Pinto. GitHub. Perlbrew. Wow. I share that experience. > But BioPerl *is* dying. You might be standing on the shoulders of > giants when you use it to solve a problem, but you *definitely* have > those same giants (and their extended families) on your shoulders > every time I see you try move the project forward. All of that > history has become the tail that's wagging the dog. I share your sentiment. Most of BioPerl is architected so badly I can't stomach it most days, and I've worked on hairy codebases included perl itself. There's just too much sick and wrong. It's like hundreds of dot-com-era cgi scripts. 
The problem (which is common in scientific computing) is that once code works it's effectively abandoned. BioPerl is essentially a gathering of more than a thousand such modules. > If all y'all are going to keep the thing alive, moving forward and > contributing to new great works then make Apple your hero. Deprecate > the stuff that's holding you back, give folks a path forward and move > on. That would be lovely, but who is going to do that? We're suffering from the tragedy of the commons. > Have fun. Use sharp tools. Do cool science. Build cool things. > Advance your careers (forgot that one last time). Be reasonable and > professional. Sounds like good advice to me :-) > Supporting last year's projects is someone else's business > opportunity. True! > ps. Are all y'all following this thread? > > http://news.ycombinator.com/item?id=5123022 > > Maybe someone should search down for this bit: "Where to start? Any > list of this [sic] projects?" and insert a plug for the various > open-bio projects. (But "someone" doesn't work here, he said...). Interesting discussion, though the original post is too cynical even for my taste. Leon From cjfields at illinois.edu Fri Feb 8 09:08:56 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 8 Feb 2013 14:08:56 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1DA2D@CHIMBX5.ad.uillinois.edu> On Feb 8, 2013, at 5:18 AM, Leon Timmermans wrote: > On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell wrote: >> But I'm so sick of getting into arguments (or walking away from >> them...) 
with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, >> you can't write good code in Perl, look - Ruby has GEMS!, etc... >> >> Perl of the olden days was an easy language in which to write really >> shitty code. Even the Perl of the BioPerl heyday wasn't really much >> help; role your own OO, role your own distro-building, mountains of >> monkey-work to provide consistent POD, versioning, etc... >> >> But that's not the Perl that I use. I have Moose and Moo. TAP and >> the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. >> MetaCPAN. Pinto. GitHub. Perlbrew. Wow. > > I share that experience. > >> But BioPerl *is* dying. You might be standing on the shoulders of >> giants when you use it to solve a problem, but you *definitely* have >> those same giants (and their extended families) on your shoulders >> every time I see you try move the project forward. All of that >> history has become the tail that's wagging the dog. > > I share your sentiment. Most of BioPerl is architected so badly I > can't stomach it most days, and I've worked on hairy codebases > included perl itself. There's just too much sick and wrong. It's like > hundreds of dot-com-era cgi scripts. > > The problem (which is common in scientific computing) is that once > code works it's effectively abandoned. BioPerl is essentially a > gathering of more than a thousand such modules. Yep, the progression from 'it works' to 'it works very well' tends to have very high activation energy. Many of the fixes tend to be more bandaids (get it working) than fundamental surgery. I tried my hand at this, got a few things done. >> If all y'all are going to keep the thing alive, moving forward and >> contributing to new great works then make Apple your hero. Deprecate >> the stuff that's holding you back, give folks a path forward and move >> on. > > That would be lovely, but who is going to do that? We're suffering > from the tragedy of the commons. 
Spot on, but we could break that path for the time being. I think BioPerl as is will have to be in maintenance mode; we need a new effort to break with older perl, older practices. >> Have fun. Use sharp tools. Do cool science. Build cool things. >> Advance your careers (forgot that one last time). Be reasonable and >> professional. > > Sounds like good advice to me :-) > >> Supporting last year's projects is someone else's business >> opportunity. > > True! We just need to make a bioperl 1.x branch for the maintenance bit, rechristen 'master' as 'v2', and just move on to fixing the f****** code. Let's move on that. >> ps. Are all y'all following this thread? >> >> http://news.ycombinator.com/item?id=5123022 >> >> Maybe someone should search down for this bit: "Where to start? Any >> list of this [sic] projects?" and insert a plug for the various >> open-bio projects. (But "someone" doesn't work here, he said...). > > Interesting discussion, though the original post is too cynical even > for my taste. > > Leon Yes, that's not unusual unfortunately. We have a number of physicists and mathematicians here who have started their initial forays into computational biology; they're all startled at how noisy it is and how messy code can be. Of course their disciplines have had the benefit of teaching students how to (somewhat decently) code for the last 40 years. chris From l.m.timmermans at students.uu.nl Fri Feb 8 07:08:06 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Fri, 8 Feb 2013 13:08:06 +0100 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: Message-ID: On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug wrote: > Short version: > I'd recommend to split the project into much smaller ones. Some of the > small ones will wither and die but those are the less important ones, > and will allow the others, the ones that people care about, freedom to > grow faster.
Bioperl would still be just one project, that > incorporates a hundred or so of smaller modules. Let those who care > the most about a specific module to take care of it and make the > releases. Releasing a module becomes much simpler, which means more > releases, more activity, and the smaller code base for each module > also make it less intimidating for new contributors. That has been a goal for some time now, but it's fairly complicated. Not only do we have a LOT of modules (bioperl-live alone is more than 900), they also have complicated dependencies. I've attached the results of my static dependency analysis of bioperl-live. I suspect this split-up needs to be done by automated graph analysis; it's too much to do by hand. Leon -------------- next part -------------- A non-text attachment was scrubbed... Name: deps.dot Type: application/octet-stream Size: 93463 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: deps.png Type: image/png Size: 6694525 bytes Desc: not available URL: From sebastien.moretti at unil.ch Fri Feb 8 11:19:29 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=) Date: Fri, 08 Feb 2013 17:19:29 +0100 Subject: [Bioperl-l] PhyloXML Message-ID: <51152591.9010402@unil.ch> Hi I would like to add some XML to an existing PhyloXML tree. No problem to read and write it. I would like to add something after the tag as in http://www.phyloxml.org/examples_syntax/phyloxml_syntax_example_1.html but get problems with add_phyloXML_annotation(): Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, line 1 (#1) (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor does any of its base classes. See perlobj.
Uncaught exception from user code: Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, line 1. at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984 Bio::TreeIO::phyloxml::element_default('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 670 Bio::TreeIO::phyloxml::processXMLNode('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 309 Bio::TreeIO::phyloxml::add_phyloXML_annotation('Bio::TreeIO::phyloxml=HASH(0x134b1268)', '-obj', 'Bio::Tree::Tree=HASH(0x13525258)', '-xml', 'SUMF family') called at ./add_annotation_to_phyloxml.pl line 40 I think I am doing something wrong, but what? Here is the code:
my $treeio = new Bio::TreeIO(-file => "$infile",
                             -format => 'phyloxml',
                            );
my $tree = $treeio->next_tree;
# Add annotation
$treeio->add_phyloXML_annotation(-obj => $tree,
                                 -xml => 'SUMF family',
                                );
-- Sébastien Moretti From cjfields at illinois.edu Sat Feb 9 01:25:17 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 06:25:17 +0000 Subject: [Bioperl-l] BioPerl future Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1).
That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? chris From cjfields at illinois.edu Sat Feb 9 01:43:24 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 06:43:24 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F2C6@CHIMBX5.ad.uillinois.edu> On Feb 8, 2013, at 6:08 AM, Leon Timmermans wrote: > On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug wrote: >> Short version: >> I'd recommend to split the project into much smaller ones. Some of the >> small ones will wither and die but those are the less important ones, >> and will allow the others, the ones that people care about, freedom to >> grow faster. Bioperl would still be just one project, that >> incorporates a hundred or so of smaller modules. Let those who care >> the most about a specific module to take care of it and make the >> releases. Releasing a module becomes much simpler, which means more >> releases, more activity, and the smaller code base for each module >> also make it less intimidating for new contributors. > > That has been a goal for some time now, but it's fairly complicated. > Not only do we have a LOT of modules (bioperl-live alone is more than > 900), they also have complicated dependencies. I've attached the > results of my static dependency analysis of bioperl-live. I suspect > this split-up needs to done by automated graph analysis, it's too much > to do by hand. > > Leon > Leon, I'm hoping we can do this sooner than later. In fact, if we proceed with making a 'v1' branch or something similar, we can start extricating code sooner than later (next few weeks).
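For anyone wanting to experiment with the static dependency analysis discussed in this thread, here is a rough sketch of what such a scan can look like. It is not Leon's script (that attachment was scrubbed from the archive), just an illustration under stated assumptions: extract_deps() is a hypothetical helper, the default source directory 'Bio' is an assumption, and only modules loaded with a literal use, require, use base or use parent line are seen. Modules loaded at run time from a string, such as the format drivers behind Bio::SeqIO, will not show up.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: list the Bio::* modules that a chunk of Perl
# source loads via 'use', 'require', 'use base' or 'use parent'.
# POD blocks and anything after __END__/__DATA__ are ignored.
sub extract_deps {
    my ($source) = @_;
    my %seen;
    my $in_pod = 0;
    for my $line (split /\n/, $source) {
        if ($line =~ /^=cut\b/)    { $in_pod = 0; next }
        if ($line =~ /^=[a-zA-Z]/) { $in_pod = 1 }
        next if $in_pod;
        last if $line =~ /^__(?:END|DATA)__/;
        if ($line =~ /^\s*use\s+(?:base|parent)\b(.*)/) {
            my $args = $1;    # copy before running another match
            $seen{$_} = 1 for $args =~ /\b(Bio::[\w:]+)/g;
        }
        elsif ($line =~ /^\s*(?:use|require)\s+(Bio::[\w:]+)/) {
            $seen{$1} = 1;
        }
    }
    return sort keys %seen;
}

# Walk a source tree (first argument, or ./Bio by default) and print
# the dependencies as a Graphviz DOT graph on standard output.
my $root = @ARGV ? $ARGV[0] : 'Bio';
my %deps;
if (-d $root) {
    find(sub {
        return unless /\.pm\z/;
        open my $fh, '<', $_ or return;
        local $/;                      # slurp the whole file
        my $source = <$fh>;
        my ($package) = $source =~ /^\s*package\s+([\w:]+)/m or return;
        $deps{$package} = [ grep { $_ ne $package } extract_deps($source) ];
    }, $root);
}

print "digraph deps {\n";
for my $from (sort keys %deps) {
    print qq{  "$from" -> "$_";\n} for @{ $deps{$from} };
}
print "}\n";
```

Run from a bioperl-live checkout, the output could be fed to Graphviz, or to any graph library, to look for clusters of modules that could be split off together.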
chris From cjfields at illinois.edu Sat Feb 9 08:51:35 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 13:51:35 +0000 Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future Message-ID: Sheldon, The branch is where the old (v1.x) code would reside. Master branch would be v2. Chris Sent via phone -------- Original message -------- From: Sheldon McKay Date: To: "Fields, Christopher J" Cc: BioPerl List ,gmod-gbrowse at lists.sourceforge.net Subject: Re: [Gmod-gbrowse] BioPerl future Hi Chris, This sounds like a good idea. I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands. One question I have is whether the name of branch v1 might be perceived as a step backward. How about v2? Sheldon On Saturday, February 9, 2013, Fields, Christopher J wrote: All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? 
chris _______________________________________________ Gmod-gbrowse mailing list Gmod-gbrowse at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse -- Sheldon McKay, PhD Computational Biologist DNA Learning Center Cold Spring Harbor Laboratory 1 Bungtown Rd Cold Spring Harbor, NY 11724 (516) 367-5185 www.dnalc.org From sheldon.mckay at gmail.com Sat Feb 9 08:04:50 2013 From: sheldon.mckay at gmail.com (Sheldon McKay) Date: Sat, 9 Feb 2013 08:04:50 -0500 Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> Message-ID: Hi Chris, This sounds like a good idea. I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands. One question I have is whether the name of branch v1 might be perceived as a step backward. How about v2? Sheldon On Saturday, February 9, 2013, Fields, Christopher J wrote: > All, > > (cross-posting to gmod-gbrowse) > > I want to gauge the community's thoughts on a few things. At the moment I > think we can safely say that BioPerl 1.x is in maintenance mode. By > 'maintenance mode', I mean that we can only do so much with it w/o breaking > backwards compatibility with old scripts. We need a way forward so that we > can address fundamental problems within the core codebase, namely speed.
> > I am thinking at the moment of pushing a 'v1' branch next week after I > make an official announcement, with a new 1.6 release coming out from that > branch (as already announced, tentatively scheduled for March 1). That > frees up master for any code development, removal of modules/cruft, etc. > This will open an initial path forward and at least enable us to do more. > Make sense? This of course means that any code reliant on v1 should pull > from that branch instead of 'master'. > > Thoughts? > > chris > -- Sheldon McKay, PhD Computational Biologist DNA Learning Center Cold Spring Harbor Laboratory 1 Bungtown Rd Cold Spring Harbor, NY 11724 (516) 367-5185 www.dnalc.org From cjfields at illinois.edu Sat Feb 9 23:25:14 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 10 Feb 2013 04:25:14 +0000 Subject: [Bioperl-l] BioPerl future In-Reply-To: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> Apologies if you receive this twice. I never received the replies from the gbrowse list through bioperl-l so it is possible there were mail issues last night. ------------------------ All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode.
By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? chris From genehack at genehack.org Sat Feb 9 23:36:07 2013 From: genehack at genehack.org (John SJ Anderson) Date: Sat, 9 Feb 2013 20:36:07 -0800 Subject: [Bioperl-l] BioPerl future In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6@genehack.org> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: > Thoughts? +1 The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. j. 
-- John SJ Anderson // genehack at genehack.org From carandraug+dev at gmail.com Sun Feb 10 13:40:33 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Sun, 10 Feb 2013 18:40:33 +0000 Subject: [Bioperl-l] BioPerl future Message-ID: On 10 February 2013 17:00, wrote: > Message: 3 > Date: Sat, 9 Feb 2013 20:36:07 -0800 > From: John SJ Anderson > Subject: Re: [Bioperl-l] BioPerl future > To: "Fields, Christopher J" > Cc: BioPerl List > Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org> > Content-Type: text/plain; charset=us-ascii > > On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: > >> Thoughts? > > +1 > > The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. For those interested, I have just added instructions on the wiki on how to split a subset of modules, tests, files, etc from the bioperl-live repository into a new repository while keeping their old history. http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live Carnë From cjfields at illinois.edu Sun Feb 10 15:08:35 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 10 Feb 2013 20:08:35 +0000 Subject: [Bioperl-l] BioPerl future In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE20632@CHIMBX5.ad.uillinois.edu> On Feb 10, 2013, at 12:40 PM, Carnë Draug wrote: > On 10 February 2013 17:00, wrote: >> Message: 3 >> Date: Sat, 9 Feb 2013 20:36:07 -0800 >> From: John SJ Anderson >> Subject: Re: [Bioperl-l] BioPerl future >> To: "Fields, Christopher J" >> Cc: BioPerl List >> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org> >> Content-Type: text/plain; charset=us-ascii >> >> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: >> >>> Thoughts?
>> >> +1 >> >> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. > > For those interested, I have just added instructions on the wiki on > how to split a subset of modules, tests, files, etc from the > bioperl-live repository into a new repository while keeping their old > history. > > http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live > > Carnë It's probably worth looking at this page as well, then: http://www.bioperl.org/wiki/BioPerl_Modularization We should probably merge the two. chris From hlapp at drycafe.net Sun Feb 10 20:03:34 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Sun, 10 Feb 2013 20:03:34 -0500 Subject: [Bioperl-l] PhyloXML In-Reply-To: <51152591.9010402@unil.ch> References: <51152591.9010402@unil.ch> Message-ID: On Feb 8, 2013, at 11:19 AM, Moretti Sébastien wrote: > # Add annotation > $treeio->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that?
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From sebastien.moretti at unil.ch Mon Feb 11 02:08:22 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=) Date: Mon, 11 Feb 2013 08:08:22 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: References: <51152591.9010402@unil.ch> Message-ID: <511898E6.7060400@unil.ch> >> # Add annotation >> $treeio->add_phyloXML_annotation(-obj => $tree, >> -xml => 'SUMF family', >> ); > > If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? > > -hilmar I replaced $treeio with $tree in the above line but still get an error. I don't see what you mean by "the stack suggests that the above isn't the exact line in your script". The only thing I changed is the length of the xml string I try to insert. But I get the same error with an empty xml string.
my $treeio = new Bio::TreeIO(-file => "$infile",
                             -format => 'phyloxml',
                            );
my $tree = $treeio->next_tree;
# Add annotation
$tree->add_phyloXML_annotation(-obj => $tree,
                               -xml => 'SUMF family',
                              );
Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor does any of its base classes. See perlobj. Uncaught exception from user code: Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1.
at ./add_annotation_to_phyloxml.pl line 40 -- Sébastien Moretti Department of Ecology and Evolution, Biophore, University of Lausanne, CH-1015 Lausanne, Switzerland Tel.: +41 (21) 692 4221/4079 http://bioinfo.unil.ch/ From saladi1 at illinois.edu Tue Feb 12 16:24:34 2013 From: saladi1 at illinois.edu (Shyam Saladi) Date: Tue, 12 Feb 2013 13:24:34 -0800 Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons Message-ID: Hi, I am using the count_codons method from Bio::Tools::SeqStats and keep getting "AMBIGUOUS" codons, but I can't figure out why exactly. When I translate the same sequence that gives the error using another standard utility like (ExPASy - Translate), it seems to work alright. An example sequence is below. Could anyone lend some insight? Thanks, Shyam AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC AGT *AMBIGUOUS* ATA ATC ATG ATT CAA CAC CAG CAT CCA CCC CCG CCT CGA CGC CGG CGT CTA CTC CTG CTT GAA GAC GAG GAT GCA GCC GCG GCT GGA GGC GGG GGT GTA GTC GTG GTT TAA TAC TAT TCA TCC TCG TCT TGG TGT TTA TTC TTG TTT count filename
1.052631578947368421052631578947368421053 1.244019138755980861244019138755980861244 0.3827751196172248803827751196172248803828 0.7655502392344497607655502392344497607656 0.1913875598086124401913875598086124401914 2.488038277511961722488038277511961722488 0.4784688995215311004784688995215311004785 0.6698564593301435406698564593301435406699 2.105263157894736842105263157894736842105 0.8612440191387559808612440191387559808612 2.870813397129186602870813397129186602871 1.435406698564593301435406698564593301435 1.722488038277511961722488038277511961722 2.775119617224880382775119617224880382775 2.00956937799043062200956937799043062201 2.488038277511961722488038277511961722488 3.540669856459330143540669856459330143541 2.00956937799043062200956937799043062201 0.1913875598086124401913875598086124401914 2.392344497607655502392344497607655502392 0.8612440191387559808612440191387559808612 5.454545454545454545454545454545454545455 1.913875598086124401913875598086124401914 0.8612440191387559808612440191387559808612 4.593301435406698564593301435406698564593 2.679425837320574162679425837320574162679 0.09569377990430622009569377990430622009569 1.148325358851674641148325358851674641148 1.148325358851674641148325358851674641148 0.8612440191387559808612440191387559808612 0.4784688995215311004784688995215311004785 2.105263157894736842105263157894736842105 0.9569377990430622009569377990430622009569 0.9569377990430622009569377990430622009569 0.09569377990430622009569377990430622009569 2.679425837320574162679425837320574162679 2.966507177033492822966507177033492822967 3.062200956937799043062200956937799043062 2.775119617224880382775119617224880382775 1045 temp.seq 
ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTACGCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTCGTATTTGCCTTCGTACCACCTGC
GGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAGATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTAGGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA From bosborne11 at verizon.net Tue Feb 12 21:30:08 2013 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 12 Feb 2013 21:30:08 -0500 Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons In-Reply-To: References: Message-ID: Shyam, An ambiguous codon would be one that has a character other than [ACTGU] in it. I see '!' in your sequences, that would create an ambiguous codon. Brian O. On Feb 12, 2013, at 4:24 PM, Shyam Saladi wrote: > Hi, > > I am using the count_codons method from Bio::Tools::SeqStats and keep > getting "AMBIGUOUS" codons, but I can't figure out why exactly. > > When I translate the same sequence that gives the error using another > standard utility like (ExPASy - Translate), it seems to work alright. > > An example sequence is below. Could anyone lend some insight? 
> > Thanks, > Shyam > > > > AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC > AGT *AMBIGUOUS* ATA ATC ATG ATT CAA CAC > CAG CAT CCA CCC CCG CCT CGA CGC CGG > CGT CTA CTC CTG CTT GAA GAC GAG GAT GCA > GCC GCG GCT GGA GGC GGG GGT GTA GTC > GTG GTT TAA TAC TAT TCA TCC TCG TCT TGG > TGT TTA TTC TTG TTT count filename > 1.722488038277511961722488038277511961722 > 2.966507177033492822966507177033492822967 > 1.531100478468899521531100478468899521531 > 0.9569377990430622009569377990430622009569 > 0.4784688995215311004784688995215311004785 > 1.722488038277511961722488038277511961722 > 1.33971291866028708133971291866028708134 > 1.913875598086124401913875598086124401914 > 0.1913875598086124401913875598086124401914 > 0.7655502392344497607655502392344497607656 > 1.435406698564593301435406698564593301435 * > 0.09569377990430622009569377990430622009569* > 0.3827751196172248803827751196172248803828 > 2.488038277511961722488038277511961722488 > 3.349282296650717703349282296650717703349 > 3.636363636363636363636363636363636363636 > 2.870813397129186602870813397129186602871 > 0.3827751196172248803827751196172248803828 > 1.626794258373205741626794258373205741627 > 0.4784688995215311004784688995215311004785 > 1.722488038277511961722488038277511961722 > 0.5741626794258373205741626794258373205742 > 1.052631578947368421052631578947368421053 > 1.244019138755980861244019138755980861244 > 0.3827751196172248803827751196172248803828 > 0.7655502392344497607655502392344497607656 > 0.1913875598086124401913875598086124401914 > 2.488038277511961722488038277511961722488 > 0.4784688995215311004784688995215311004785 > 0.6698564593301435406698564593301435406699 > 2.105263157894736842105263157894736842105 > 0.8612440191387559808612440191387559808612 > 2.870813397129186602870813397129186602871 > 1.435406698564593301435406698564593301435 > 1.722488038277511961722488038277511961722 > 2.775119617224880382775119617224880382775 > 2.00956937799043062200956937799043062201 > 2.488038277511961722488038277511961722488 > 
3.540669856459330143540669856459330143541 > 2.00956937799043062200956937799043062201 > 0.1913875598086124401913875598086124401914 > 2.392344497607655502392344497607655502392 > 0.8612440191387559808612440191387559808612 > 5.454545454545454545454545454545454545455 > 1.913875598086124401913875598086124401914 > 0.8612440191387559808612440191387559808612 > 4.593301435406698564593301435406698564593 > 2.679425837320574162679425837320574162679 > 0.09569377990430622009569377990430622009569 > 1.148325358851674641148325358851674641148 > 1.148325358851674641148325358851674641148 > 0.8612440191387559808612440191387559808612 > 0.4784688995215311004784688995215311004785 > 2.105263157894736842105263157894736842105 > 0.9569377990430622009569377990430622009569 > 0.9569377990430622009569377990430622009569 > 0.09569377990430622009569377990430622009569 > 2.679425837320574162679425837320574162679 > 2.966507177033492822966507177033492822967 > 3.062200956937799043062200956937799043062 > 2.775119617224880382775119617224880382775 1045 temp.seq > > 
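
A minimal sketch of the check Brian describes: scan a sequence for characters outside [ACGTU] and report the codon each one falls in. It uses core Perl only. The helper name find_ambiguous and the choice of reading frame 0 are assumptions for illustration and are not part of Bio::Tools::SeqStats. The sequence as originally posted contains a Y (the IUPAC code for C or T) in ...GGCTTYTTG..., which would be consistent with the single ambiguous codon in the table (1 of 1045). The '!' characters appear only in the re-wrapped quoted copy.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return one record per character outside [ACGTU]: the character, its
# 1-based position, and the frame-0 codon that contains it.
sub find_ambiguous {
    my ($seq) = @_;
    $seq = uc $seq;
    $seq =~ s/\s+//g;    # drop whitespace and line breaks first
    my @hits;
    while ( $seq =~ /([^ACGTU])/g ) {
        my $pos   = pos($seq);                      # 1-based position of the match
        my $start = 3 * int( ( $pos - 1 ) / 3 );    # 0-based start of its codon
        push @hits,
          { char => $1, pos => $pos, codon => substr( $seq, $start, 3 ) };
    }
    return @hits;
}

# Short fragment around the Y in the posted sequence; the frame used here
# is arbitrary and only for demonstration.
my @hits = find_ambiguous('TCTGGCTTYTTGATG');
printf "%s at position %d (codon %s)\n", @{$_}{qw(char pos codon)} for @hits;
```

Running the same scan over the full sequence before calling count_codons shows which characters, if any, would be counted as ambiguous.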
ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTAC! > GCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTC! 
> GTATTTGCCTTCGTACCACCTGCGGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAG > ATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTA! > GGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 13 10:18:10 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 15:18:10 +0000 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> All, tl;dr: A lot of change is coming. Be forewarned and be prepared. This is an 'official' announcement to the BioPerl community on future BioPerl plans. We have decided to move continued maintenance of Bioperl release series over to the new 'v1' branch. 
This branch will be the point where any future versions of 1.6.x code will be released, starting with the (already-scheduled) March 1 release. The 'master' branch will become the main focal point for future development of BioPerl going into an eventual v2 release, with a focus on performance enhancements, addressing newer technologies like NGS and large data, code cleanup, and simplifying the code base.

We welcome any help with code improvements. GMOD folks? Want to help? This is a good opportunity to address BioPerl short-comings in the code base!

What this means for anyone using BioPerl currently:

1) We anticipate significant issues if you are relying on the 'master' branch for anything. To inelegantly state it, the core developers are taking back the 'master' branch for future development. Please please please do not rely on the 'master' branch for stable code; if you are reliant on the BioPerl 1.6.x, make sure to use 'v1'. We can revisit whether to make 'v1' the default checkout branch if/when the need arises.

2) Expect not to find some modules. We will be migrating modules requiring external dependencies and other associated chunks of the code base out into their own repositories over the next year to help future maintenance; the eventual intent is to release all of these independently on CPAN. We will completely remove all code previously marked as deprecated, and we may immediately deprecate additional modules if needed (this will of course be discussed on list).

3) Expect version numbering to change significantly. Because we are releasing code in separate repositories, I fully expect downstream versioning problems if we stick with the current system (e.g. all bioperl-live modules having the same version). It will be too much of a headache to sync versions for all modules as this will entail making a full release of all bioperl code, one of the main reasons we are splitting out code to begin with. At the moment, no specific versioning scheme has been chosen, though I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions). This is the standard that Lincoln has adopted for Bio::Graphics and GBrowse.

4) Expect quick deprecation of methods within modules as needed. These should of course be brought up to the mail list prior to actual implementation, but I would anticipate some things changing as we try to adopt a more consistent method naming scheme.

5) The same steps outlined for bioperl-live will apply for bioperl-run modules. We will have to decide the best approach to use for those, e.g. whether to separate them out based on task (alignment), application group (NGS, BLAST, RNA), etc. and how these may fit organically with bioperl-live modules where appropriate.

6) Do not expect a new CPAN release of such code until Dec 2013. Even then it will be in an alpha stage. We are all busy campers.

We do not anticipate significant changes to bioperl-network or bioperl-db at this time beyond updating them to deal with new changes.

I'm sure there are many other points that need to be discussed. Please reply over the next week if you have any concerns.

chris

From cjfields at illinois.edu Wed Feb 13 11:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 16:01:07 +0000
Subject: [Bioperl-l] Test-pls ignore
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2506D@CHIMBX5.ad.uillinois.edu>

testing the mail list to see if it is working.

-c From sebastien.moretti at unil.ch Wed Feb 13 11:21:23 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=) Date: Wed, 13 Feb 2013 17:21:23 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> Message-ID: <511BBD83.2000708@unil.ch> >>>> # Add annotation >>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>> -xml => 'SUMF family', >>>> ); >>> >>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >>> >>> -hilmar >> >> I replaced $treeio by $tree in the above line but still get an error. >> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >> >> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. >> >> >> >> my $treeio = new Bio::TreeIO(-file => "$infile", >> -format => 'phyloxml', >> ); >> my $tree = $treeio->next_tree; >> >> # Add annotation >> $tree->add_phyloXML_annotation(-obj => $tree, >> -xml => 'SUMF family', >> ); >> >> Can't locate object method "add_phyloXML_annotation" via package >> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >> (F) You called a method correctly, and it correctly indicated a package >> functioning as a class, but that package doesn't define that particular >> method, nor does any of its base classes. See perlobj. >> >> Uncaught exception from user code: >> Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. >> at ./add_annotation_to_phyloxml.pl line 40 > > Will have to look into this. 
One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. > > chris You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? The problem I have is "expected" ? -- S?bastien Moretti From cjfields at illinois.edu Wed Feb 13 10:47:17 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 15:47:17 +0000 Subject: [Bioperl-l] PhyloXML In-Reply-To: <511898E6.7060400@unil.ch> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> On Feb 11, 2013, at 1:08 AM, S?bastien MORETTI wrote: >>> # Add annotation >>> $treeio->add_phyloXML_annotation(-obj => $tree, >>> -xml => 'SUMF family', >>> ); >> >> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >> >> -hilmar > > I replaced $treeio by $tree in the above line but still get an error. > Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" > > The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. > > > > my $treeio = new Bio::TreeIO(-file => "$infile", > -format => 'phyloxml', > ); > my $tree = $treeio->next_tree; > > # Add annotation > $tree->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); > > Can't locate object method "add_phyloXML_annotation" via package > "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) > (F) You called a method correctly, and it correctly indicated a package > functioning as a class, but that package doesn't define that particular > method, nor does any of its base classes. See perlobj. 
> > Uncaught exception from user code: > Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. > at ./add_annotation_to_phyloxml.pl line 40 > > > > -- > S?bastien Moretti > Department of Ecology and Evolution, > Biophore, University of Lausanne, > CH-1015 Lausanne, Switzerland > Tel.: +41 (21) 692 4221/4079 > http://bioinfo.unil.ch/\ Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. chris From carandraug+dev at gmail.com Wed Feb 13 12:23:23 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Wed, 13 Feb 2013 17:23:23 +0000 Subject: [Bioperl-l] Next BioPerl release Message-ID: On 5 February 2013 21:53, Fields, Christopher J wrote: > I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! Hi is this release of bioperl-live only or also includes bioperl-run? Carn? From cjfields at illinois.edu Wed Feb 13 12:08:21 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 17:08:21 +0000 Subject: [Bioperl-l] PhyloXML In-Reply-To: <511BBD83.2000708@unil.ch> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> <511BBD83.2000708@unil.ch> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 10:21 AM, Moretti S?bastien wrote: >>>>> # Add annotation >>>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>>> -xml => 'SUMF family', >>>>> ); >>>> >>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? 
>>>> >>>> -hilmar >>> >>> I replaced $treeio by $tree in the above line but still get an error. >>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >>> >>> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. >>> >>> >>> >>> my $treeio = new Bio::TreeIO(-file => "$infile", >>> -format => 'phyloxml', >>> ); >>> my $tree = $treeio->next_tree; >>> >>> # Add annotation >>> $tree->add_phyloXML_annotation(-obj => $tree, >>> -xml => 'SUMF family', >>> ); >>> >>> Can't locate object method "add_phyloXML_annotation" via package >>> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >>> (F) You called a method correctly, and it correctly indicated a package >>> functioning as a class, but that package doesn't define that particular >>> method, nor does any of its base classes. See perlobj. >>> >>> Uncaught exception from user code: >>> >>> at ./add_annotation_to_phyloxml.pl line 40 >> >> Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. >> >> chris > > You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? > The problem I have is "expected" ? > > -- > S?bastien Moretti I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky. I tried cleaning this up a few years back but didn't make much progress. The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it): $treeio->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back. Can you file a bug on this? 
https://redmine.open-bio.org/

chris

From cjfields at illinois.edu Wed Feb 13 13:05:53 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 18:05:53 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To:
References:
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 11:23 AM, Carnë Draug wrote:

> On 5 February 2013 21:53, Fields, Christopher J wrote:
>> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated!
>
> Hi
>
> is this release of bioperl-live only or also includes bioperl-run?
>
> Carnë

We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance.

chris

From l.m.timmermans at students.uu.nl Wed Feb 13 14:44:22 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Wed, 13 Feb 2013 20:44:22 +0100
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
Message-ID:

On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J wrote:
> We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance.

Also, bioperl-run needs to be cut into smaller distributions even more than bioperl-live. Few people, if anyone at all, have all the tools it tries to wrap at hand, so it's almost impossible to pass its testing suite.

We need dists that can realistically pass.

Leon

From cjfields at illinois.edu Wed Feb 13 16:04:26 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 21:04:26 +0000
Subject: [Bioperl-l] Next BioPerl release
In-Reply-To:
References: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25B07@CHIMBX5.ad.uillinois.edu>

On Feb 13, 2013, at 1:44 PM, Leon Timmermans wrote:

> On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J
> wrote:
>> We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance.
>
> Also, bioperl-run needs to be cut into smaller distributions even more
> than bioperl-live. Few people if anyone at all has all tools it tries
> to wrap at hand, so its almost impossible to pass its testing suite.
>
> We need dists that can realistically pass.
>
> Leon

Yup. It's a mess.

chris

From florent.angly at gmail.com Wed Feb 13 17:33:14 2013
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 14 Feb 2013 08:33:14 +1000
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu>
Message-ID: <511C14AA.9030107@gmail.com>

On 14/02/13 01:18, Fields, Christopher J wrote:
> I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions)

Yes, I support the X.Y versioning as well.
Florent

From l.m.timmermans at students.uu.nl Wed Feb 13 18:12:06 2013
From: l.m.timmermans at students.uu.nl (Leon Timmermans)
Date: Thu, 14 Feb 2013 00:12:06 +0100
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development
In-Reply-To: <511C14AA.9030107@gmail.com>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> <511C14AA.9030107@gmail.com>
Message-ID:

On Wed, Feb 13, 2013 at 11:33 PM, Florent Angly wrote:
> On 14/02/13 01:18, Fields, Christopher J wrote:
>>
>> I *highly* recommend using X.Y versioning for simplicity (e.g. no more
>> 3-point versions)
>
> Yes, I support the X.Y versioning as well.
> Florent

See also: http://www.dagolden.com/index.php/369/version-numbers-should-be-boring/

Leon

From daisieh at gmail.com Thu Feb 14 00:21:15 2013
From: daisieh at gmail.com (Daisie Huang)
Date: Wed, 13 Feb 2013 21:21:15 -0800 (PST)
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To:
References:
Message-ID: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>

I think you need to reset the pointer to the filehandle before you go through the while loop the second time:

    seek $fh,0,0

On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
>
> Hey Guys,
>
> I am still at the same place. I am writing these little pieces of code to
> try to learn the language better, so any advice would be useful. I am again
> parsing through tab delimited files and now trying to find fish from on id
> (in these case families AS5 and AS9), retrieve the weights and average
> them. When I started I did it for one family and it worked (instead of the
> @families I had a scalar $family set to AS5). But really it is more useful
> to look at more than one family at time (I should mention that are 2 types
> of fish per family one ends in PS , the other doesn't). So I tried to use a
> foreach loop to go through the file twice, once with a the search value set
> to AS5 and a second time to AS9.
> It works for AS5, but for some reason, the
> foreach loop sets $test to AS9 the second time, but it doesn't go through
> the while loop. What am I doing wrong?
>
> here is the code:
>
> #! /usr/bin/perl
> use strict;
> use warnings;
>
> my $file = $ARGV[0];
> my @family = ('AS5','AS9');
> my $i;
> my $ii;
> my $test;
>
> open (my $fh, "<", $file) or die ("Can't open $file: $!");
>
> foreach (@family){
>     $test = $_;
>     my @data_weight_2N = ();
>     my @data_weight_3N = ();
>     while (<$fh>){
>         chomp;
>         my $line = $_;
>         my @data = split ("\t", $line);
>         if ($data[0] !~ /[0-9]*/){
>             next;}
>         elsif ($data[1] eq "ABF09-$test"){
>             $i += 1;
>             push (@data_weight_2N, $data[6]);
>         }elsif ($data[1] eq "ABF09-".$test."PS"){
>             $ii += 1;
>             push (@data_weight_3N,$data[6]);
>         }
>     }
>     my $mean_2N = &average (\@data_weight_2N);
>     my $stdev_2N = &stdev (\@data_weight_2N);
>     my $stderr_2N = ($stdev_2N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test
> 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
>
>     my $mean_3N = &average (\@data_weight_3N);
>     my $stdev_3N = &stdev (\@data_weight_3N);
>     my $stderr_3N = ($stdev_3N/sqrt($i));
>
>     print "These are the the avearge weight, stdev and stderr for $test
> 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
> }
>
> close ($fh);
>
> sub average{
>     my($data) = @_;
>     if (not @$data) {
>         print ("Empty array\n");
>         return 0;
>     }
>     my $total = 0;
>     foreach (@$data) {
>         $total += $_;
>     }
>     my $average = $total / @$data;
>     return $average;
> }
>
> sub stdev{
>     my($data) = @_;
>     if(@$data == 1){
>         return 0;
>     }
>     my $average = &average($data);
>     my $sqtotal = 0;
>     foreach(@$data) {
>         $sqtotal += ($average-$_) ** 2;
>     }
>     my $std = ($sqtotal / (@$data-1)) ** 0.5;
>     return $std;
> }
>
> Thanks,
>
> T.
>
> --
> "Education is not to be used to promote obscurantism." - Theodonius
> Dobzhansky.
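
A minimal sketch of the rewind Daisie suggests: once a filehandle has been read to end-of-file, the next while loop over it gets nothing until seek moves the position back to the start. The helper name count_per_family and the in-memory demo rows are hypothetical. Only the ABF09-<family> naming and the PS suffix come from the posted script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count rows whose second tab-separated field belongs to each family,
# re-reading the same filehandle once per family.
sub count_per_family {
    my ( $fh, @families ) = @_;
    my %count;
    for my $family (@families) {
        # Rewind before every pass; without this, the second pass starts
        # at end-of-file and the while loop body never runs.
        seek( $fh, 0, 0 ) or die "Cannot rewind: $!";
        $count{$family} = 0;
        while ( my $line = <$fh> ) {
            chomp $line;
            my @data = split /\t/, $line;
            next unless defined $data[1];
            $count{$family}++ if $data[1] =~ /^ABF09-\Q$family\E(?:PS)?$/;
        }
    }
    return %count;
}

# Demo on an in-memory file (hypothetical rows, not the poster's data).
my $demo = "1\tABF09-AS5\n2\tABF09-AS9PS\n3\tABF09-AS5PS\n";
open( my $fh, '<', \$demo ) or die "Cannot open in-memory file: $!";
my %count = count_per_family( $fh, 'AS5', 'AS9' );
print "$_: $count{$_}\n" for sort keys %count;
close $fh;
```

Reading the file once and grouping rows by family in a hash avoids the rewind altogether, which is the single-pass approach described later in the thread.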
> > "Gracias a la vida que me ha dado tanto > Me ha dado el sonido y el abecedario > Con ?l, las palabras que pienso y declaro > Madre, amigo, hermano > Y luz alumbrando la ruta del alma del que estoy amando > > Gracias a la vida que me ha dado tanto > Me ha dado la marcha de mis pies cansados > Con ellos anduve ciudades y charcos > Playas y desiertos, monta?as y llanos > Y la casa tuya, tu calle y tu patio" > > Violeta Parra - Gracias a la Vida > > Tiago S. F. Hori. PhD. > Ocean Science Center-Memorial University of Newfoundland > From sebastien.moretti at unil.ch Thu Feb 14 03:09:06 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=) Date: Thu, 14 Feb 2013 09:09:06 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> <511BBD83.2000708@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> Message-ID: <511C9BA2.9000508@unil.ch> >>>>>> # Add annotation >>>>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>>>> -xml => 'SUMF family', >>>>>> ); >>>>> >>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >>>>> >>>>> -hilmar >>>> >>>> I replaced $treeio by $tree in the above line but still get an error. >>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >>>> >>>> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. 
>>>> >>>> >>>> >>>> my $treeio = new Bio::TreeIO(-file => "$infile", >>>> -format => 'phyloxml', >>>> ); >>>> my $tree = $treeio->next_tree; >>>> >>>> # Add annotation >>>> $tree->add_phyloXML_annotation(-obj => $tree, >>>> -xml => 'SUMF family', >>>> ); >>>> >>>> Can't locate object method "add_phyloXML_annotation" via package >>>> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >>>> (F) You called a method correctly, and it correctly indicated a package >>>> functioning as a class, but that package doesn't define that particular >>>> method, nor does any of its base classes. See perlobj. >>>> >>>> Uncaught exception from user code: >>>> >>>> at ./add_annotation_to_phyloxml.pl line 40 >>> >>> Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. >>> >>> chris >> >> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? >> The problem I have is "expected" ? >> >> -- >> S?bastien Moretti > > I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky. I tried cleaning this up a few years back but didn't make much progress. > > The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it): > > $treeio->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); > > My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back. Can you file a bug on this? > > https://redmine.open-bio.org/ > > chris I will fill a bug on this. I'd be happy to try to contribute to the phyloxml code. But don't know how to proceed for BioPerl. 
--
Sébastien Moretti

From hartzell at alerce.com Thu Feb 14 15:04:44 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 14 Feb 2013 12:04:44 -0800
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
References: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
Message-ID: <20765.17244.185833.755900@gargle.gargle.HOWL>

I think that it's important to get feedback on code that one has written and to try to understand what someone else has done in their code, and how and why. To that end....

Since Tiago's using this to learn the language better I can't resist some comments beyond resetting the file handle.

For grins I rewrote it to use Text::CSV_XS and Statistics::Basic and to take a single pass through the data file using a multilevel data structure. I resisted the urge to rewrite it in Moose. Didn't even have an urge to rewrite it in R. Funny, that....

The script is here

  Tiago.pl https://gist.github.com/hartzell/4955401

With something like what I think the data looks like here:

  https://gist.github.com/hartzell/4955570

Even without that big of a rewrite, I had a bunch of local comments which are inline below.

Daisie Huang writes:
 > [...]
 > On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
 > >
 > > Hey Guys,
 > >
 > > I am still at the same place. I am writing these little pieces of code to
 > > try to learn the language better, so any advice would be useful.
 > > [...]
 > > here is the code:
 > >
 > > #! /usr/bin/perl
 > > use strict;
 > > use warnings;
 > >
 > > my $file = $ARGV[0];

Slightly better would be $filename, so that when you step up to Path::Class you can differentiate a file object from a file name string.

 > > my @family = ('AS5','AS9');

Better would be @families, plural. See the use of $family below.

 > > my $i;
 > > my $ii;

As far as I can tell, these are just counting the number of things that you push onto the various arrays.
You don't need them; referring to the array in scalar context will give you its size.

 > > my $test;

You use this to hold the name of the family, so it's not particularly evocative. You should also restrict its scope to within the loop. See the comment for the foreach loop.

 > > open (my $fh, "<", $file) or die ("Can't open $file: $!");

You made my day, three arg. open *and* you checked for errors. Nice!

 > > foreach (@family){

Better would be

  for my $family (@families) {

which is evocative and restricts the scope of $family to the for loop (and for is 4 characters shorter than foreach...).

 > >     $test = $_;

No longer need this, using $family declared in the for loop with the proper scoping.

 > >     my @data_weight_2N = ();
 > >     my @data_weight_3N = ();
 > >     while (<$fh>){
 > >         chomp;
 > >         my $line = $_;
 > >         my @data = split ("\t", $line);

Don't parse CSV (TSV) files yourself. Get in the habit of using Text::CSV_XS.

 > >         if ($data[0] !~ /[0-9]*/){
 > >             next;}
 > >         elsif ($data[1] eq "ABF09-$test"){
 > >             $i += 1;

You don't need the counter.

 > >             push (@data_weight_2N, $data[6]);
 > >         }elsif ($data[1] eq "ABF09-".$test."PS"){
 > >             $ii += 1;

You don't need the counter.

 > >             push (@data_weight_3N,$data[6]);
 > >         }
 > >     }
 > >     my $mean_2N = &average (\@data_weight_2N);
 > >     my $stdev_2N = &stdev (\@data_weight_2N);

You don't need the ampersands on the subroutine calls. They're old school and just encourage people to make fun of our language for its use of all those funny punctuation marks.

 > >     my $stderr_2N = ($stdev_2N/sqrt($i));

Unless I'm mistaken, this is equivalent to

  my $stderr_2N = ($stdev_2N/sqrt(scalar @data_weight_2N));

and you don't need the counter; the explicit use of scalar there might even be redundant (I'm a coward). You use the same trick in your subroutine defn's below.
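
A minimal sketch of the scalar-context point, using only core Perl. The weights are made up for illustration and are not taken from Tiago's data; the names $n_explicit, $n_implicit and $root are likewise assumptions, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical weights, for illustration only.
my @data_weight_2N = (10, 12, 14, 16);

# An array evaluated in scalar context yields its element count,
# so no separate counter variable is needed.
my $n_explicit = scalar @data_weight_2N;   # explicit scalar context
my $n_implicit = @data_weight_2N;          # assigning to a scalar does the same

# sqrt() is a named unary operator, so it imposes scalar context
# on its argument by itself.
my $root = sqrt(@data_weight_2N);

print "n = $n_explicit, n = $n_implicit, sqrt(n) = $root\n";
```

Because the count is read from the array at the point of use, it cannot drift out of step with the data the way a hand-maintained counter can when the loop runs for more than one family.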
 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test
 > > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
 > >
 > >     my $mean_3N = &average (\@data_weight_3N);
 > >     my $stdev_3N = &stdev (\@data_weight_3N);
 > >     my $stderr_3N = ($stdev_3N/sqrt($i));
 > >
 > >     print "These are the the avearge weight, stdev and stderr for $test
 > > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
 > > }
 > >
 > > close ($fh);

Ah, rats. You checked whether open worked, you need to do the same thing on close too!

  close ($fh) or die $!;

Or you could just

  use autodie qw(open close);

and then they'll die appropriately when they have to and you don't have to bother with the checking.

 > > sub average{
 > >     my($data) = @_;
 > >     if (not @$data) {
 > >         print ("Empty array\n");
 > >         return 0;
 > >     }
 > >     my $total = 0;
 > >     foreach (@$data) {
 > >         $total += $_;
 > >     }

  use List::AllUtils qw(sum); # somewhere up at the top of the script...

  my $total = sum(@$data);
  if (not defined $total) {
      print "Empty array\n";
      return;
  }

List::AllUtils is your friend. Learn to use it.

Returning 0 for an empty list is probably the wrong thing; isn't it possible for the total to actually be 0? Just return instead. Don't return undef, just return (and let perl take context into account for you).

You probably don't actually want to spew "Empty array" out into your output stream; imagine writing a script that postprocesses your output and having to deal with it. If you really need to say it, send it to standard error with

  print STDERR "Empty array\n";

 > >     my $average = $total / @$data;
 > >     return $average;

If you don't really need the error message, then you can get to

  my $total = sum(@$data);
  return unless defined $total;
  return $total / @$data;

And if an empty data array is *truly* unexpected, maybe you should just die/carp.
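
Pulled together, the advice above might look like the sketch below. It is one possible reading, not the version in the gist. It assumes the core List::Util module in place of List::AllUtils (both provide sum), and the sample data sets in the loop are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum);   # core module; assumed here instead of List::AllUtils

# Mean of the values in an array reference.
# Returns nothing (undef in scalar context) for an empty array, so that
# "no data" can be told apart from a genuine mean of 0.
sub average {
    my ($data) = @_;
    return unless @$data;            # test for emptiness, not for a false sum
    return sum(@$data) / @$data;     # @$data in numeric context is the count
}

# Invented sample data: a normal set, a set that sums to zero, an empty set.
for my $set ([2, 4, 6], [-1, 1], []) {
    my $mean = average($set);
    print defined $mean ? "mean = $mean\n" : "no data\n";
}
```

The second data set is the case that a plain truth test on the total would get wrong: the sum is 0, which is false in Perl, yet the array is not empty.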
 > > }
 > >
 > > sub stdev{
 > >     my($data) = @_;
 > >     if(@$data == 1){
 > >         return 0;
 > >     }
 > >     my $average = &average($data);
 > >     my $sqtotal = 0;
 > >     foreach(@$data) {
 > >         $sqtotal += ($average-$_) ** 2;
 > >     }
 > >     my $std = ($sqtotal / (@$data-1)) ** 0.5;
 > >     return $std;
 > > }

Ditto on the use of List::AllUtils, etc...

Phew.

The only other thing I'd like to see would be an arrangement that lets you write simple tests. A simple sol'n would be to package the entire main part of the code up into e.g. a subroutine that returns a hashref keyed by family, containing a hashref keyed by 2N/3N/... and then you could just:

  use Test::More;
  use Tiago qw(summarize);

  my $output = summarize("test_data.tsv");
  is($output->{AS5}->{'2N'}, "42", "Got the magic number");
  # etc...
  done_testing;

Thanks for sharing your code. Keep practicing!

g.

From carandraug+dev at gmail.com Thu Feb 14 17:13:45 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Thu, 14 Feb 2013 22:13:45 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
Message-ID:

Hi

we got word of it on another project I'm involved with and I was wondering. Is bioperl going to apply for the Google Summer of Code this year?

http://www.google-melange.com/gsoc/homepage/google/gsoc2013

Carnë

From hlapp at drycafe.net Fri Feb 15 09:28:30 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 15 Feb 2013 09:28:30 -0500
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To:
References:
Message-ID: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>

I presume the OBF does as an umbrella organization on behalf of all Bio* projects. If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or to look for co-mentors.

-hilmar

Sent with a tap.

On Feb 14, 2013, at 5:13 PM, Carnë Draug wrote:

> Hi
>
> we got word of it on another project I'm involved with and I was
> wondering. Is bioperl going to apply for the Google Summer of Code
> this year?
>
> http://www.google-melange.com/gsoc/homepage/google/gsoc2013
>
> Carnë
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From p.j.a.cock at googlemail.com Fri Feb 15 09:47:39 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 15 Feb 2013 14:47:39 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
Message-ID:

On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp wrote:
> I presume the OBF does as an umbrella organization on behalf of all Bio*
> projects. If you fancy proposing a project idea or mentoring, now is not a
> bad time to think about that or looking for co-mentors.
>
> -hilmar

Yes, the plan is that as in the last few years, the OBF will apply to GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At this stage the Bio* projects would be wise to start coming up with some good project ideas and experienced developers thinking about being a mentor. For potential students, getting involved in the community early is a good idea (e.g. bug reports, or better fixing existing bugs)

See also:
http://lists.open-bio.org/mailman/listinfo/gsoc
http://lists.open-bio.org/mailman/listinfo/gsoc-mentors

Peter

From cjfields at illinois.edu Fri Feb 15 09:59:43 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 15 Feb 2013 14:59:43 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To:
References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>

On Feb 15, 2013, at 8:47 AM, Peter Cock wrote:

> On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp wrote:
>> I presume the OBF does as an umbrella organization on behalf of all Bio*
>> projects.
>> If you fancy proposing a project idea or mentoring, now is not a
>> bad time to think about that or looking for co-mentors.
>>
>> -hilmar
>
> Yes, the plan is that as in the last few years, the OBF will apply to
> GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
> this stage the Bio* projects would be wise to start coming up with
> some good project ideas and experienced developers thinking about
> being a mentor. For potential students, getting involved in the
> community early is a good idea (e.g. bug reports, or better fixing
> existing bugs)
>
> See also:
> http://lists.open-bio.org/mailman/listinfo/gsoc
> http://lists.open-bio.org/mailman/listinfo/gsoc-mentors
>
> Peter

At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else. I can't take charge of writing up a proposal at the moment but I can certainly help edit.

chris

From scott at scottcain.net Fri Feb 15 14:18:37 2013
From: scott at scottcain.net (Scott Cain)
Date: Fri, 15 Feb 2013 14:18:37 -0500
Subject: [Bioperl-l] sequence-region directives in gff files
In-Reply-To:
References:
Message-ID:

Hi Carnë,

Thanks for pointing this out; I was only sort of paying attention to the FeatureIO discussion, and it hadn't occurred to me that my commit was the problem. I believe I've reproduced the functionality from that commit, and I even added a test that makes use of the added method (yes, I know, it surprised me too!). All of the tests now pass for me in the FeatureIO master. I'm putting it on my todo list to check that the Chado loader that makes use of Bio::FeatureIO still works as expected with the new incarnation.

Thanks,
Scott

On Wed, Feb 13, 2013 at 5:22 AM, Carnë Draug wrote:
> Hi Scott
>
> 3 years ago, the code for the Bio::SeqFeatureIO::* modules was split
> from bioperl-live into a separate repository[1].
> Because the code was
> not removed from the bioperl-live repository, people ended up patching
> on both sides, leading to 2 branches of development. Last weekend I
> merged them back together with the exception of one commit that would
> no longer apply[2].
>
> This commit was authored by you with the following commit message:
> "tiny change to Bio::FeatureIO::gff to allow the gmod chado gff3 bulk
> loader to not choke when the gff file has ##sequence-region
> directives. The loader is documented not to support this, but now it
> will quitely ignore those directives."
>
> Do you think you could take a look at it?
>
> Thank you,
> Carnë
>
> [1] https://github.com/bioperl/Bio-FeatureIO
> [2] https://github.com/bioperl/bioperl-live/commit/7218728b66ad297953676236077fd0ec757378c0

--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research

From carandraug+dev at gmail.com Tue Feb 19 13:52:57 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 19 Feb 2013 18:52:57 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To:
References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>
Message-ID:

On 15 February 2013 14:28, Hilmar Lapp wrote:
> [...]
> If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors.

On 15 February 2013 14:59, Fields, Christopher J wrote:
> At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else. I can't take charge of writing up a proposal at the moment but I can certainly help edit.

I would like to participate this year as a student. I do not, however, have any bioperl itch that would take a whole summer to fix. The largest of them is to implement BLAST using NCBI's server.
They have made available a SOAP-based BLAST and doing this has been on my todo list for ages. Would you suggest any other project for bioperl?

Carnë

From peymanalavi at yahoo.com Tue Feb 19 16:16:49 2013
From: peymanalavi at yahoo.com (peyman alavi)
Date: Tue, 19 Feb 2013 13:16:49 -0800 (PST)
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
Message-ID: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>

Hello,

I have been having problems for a while trying to install the Bio::SCF module on my Vista32 machine. Now, I know that Bio::SCF isn't really a Bioperl module, but I need it for Bio::Graphics, and I thought perhaps other people had experienced the same problem before. I have installed zlib and io_lib (the latest available versions of both), but it looks like something (presumably with io_lib) is missing. I should be very grateful if someone could tell me what still needs to be done!

Here are the paths where the io_lib "library" and "include" directories are installed; I set them in cpan before trying to install Bio::SCF:

o conf makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"

And the following is what I get on STDOUT:

Set up gcc environment - 4.7.2

cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.

    makepl_arg         [LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include]
Please use 'o conf commit' to make the config permanent!

Reading 'D:\Perl\cpan\Metadata'
  Database was generated on Sun, 17 Feb 2013 12:17:02 GMT
Running install for module 'Bio::SCF'
Running make for L/LD/LDS/Bio-SCF-1.03.tar.gz
Checksum for D:\Perl\cpan\sources\authors\id\L\LD\LDS\Bio-SCF-1.03.tar.gz ok
Scanning cache D:\Perl/cpan/build for sizes
..............................................................................DONE
Bio-SCF-1.03/
Bio-SCF-1.03/t/
Bio-SCF-1.03/t/scf.t
Bio-SCF-1.03/eg/
Bio-SCF-1.03/eg/write_test_obj.pl
Bio-SCF-1.03/eg/write_test_tied.pl
Bio-SCF-1.03/eg/read_test_obj.pl
Bio-SCF-1.03/eg/read_test_tied.pl
Bio-SCF-1.03/SCF/
Bio-SCF-1.03/SCF/Arrays.pm
Bio-SCF-1.03/DISCLAIMER
Bio-SCF-1.03/README
Bio-SCF-1.03/SCF.pm
Bio-SCF-1.03/SCF.xs
Bio-SCF-1.03/Changes
Bio-SCF-1.03/test.scf
Bio-SCF-1.03/Makefile.PL
Bio-SCF-1.03/META.yml
Bio-SCF-1.03/INSTALL
Bio-SCF-1.03/MANIFEST

  CPAN.pm: Building L/LD/LDS/Bio-SCF-1.03.tar.gz

Set up gcc environment - 4.7.2
Checking if your kit is complete...
Looks good
Writing Makefile for Bio::SCF
Writing MYMETA.yml and MYMETA.json
cp SCF.pm blib\lib\Bio\SCF.pm
cp SCF/Arrays.pm blib\lib\Bio\SCF\Arrays.pm
D:\Perl\bin\perl.exe D:\Perl\site\lib\ExtUtils\xsubpp -typemap D:\Perl\lib\ExtUtils\typemap
SCF.xs > SCF.xsc && D:\Perl\bin\perl.exe -MExtUtils::Command -e mv -- SCF.xsc SCF.c
Please specify prototyping behavior for SCF.xs (see perlxs manual)
c:/MinGW/bin/gcc.exe -c -Ic:/MinGW/msys/1.0/local/include -DNDEBUG -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T -DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields -O2 -DVERSION=\"1.03\" -DXS_VERSION=\"1.03\" "-ID:\Perl\lib\CORE" -DLITTLE_ENDIAN SCF.c
In file included from c:/MinGW/msys/1.0/local/include/io_lib/scf.h:31:0,
                 from SCF.xs:12:
c:/MinGW/msys/1.0/local/include/io_lib/mFILE.h:23:0: warning: "MF_APPEND" redefined [enabled by default]
In file included from c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/windows.h:55:0,
                 from D:\Perl\lib\CORE/win32.h:61,
                 from D:\Perl\lib\CORE/win32thread.h:4,
                 from D:\Perl\lib\CORE/perl.h:2825,
                 from SCF.xs:5:
c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/winuser.h:131:0: note: this is the location of the previous definition
SCF.xs: In function 'XS_Bio__SCF_get_scf_pointer':
SCF.xs:35:2: warning: passing argument 3 of '(*Perl_ILIO_ptr((struct PerlInterpreter *)Perl_get_context()))->pNameStat' from incompatible pointer type [enabled by default]
SCF.xs:35:2: note: expected 'struct _stati64 *' but argument is of type 'struct stat *'
Running Mkbootstrap for Bio::SCF ()
D:\Perl\bin\perl.exe -MExtUtils::Command -e chmod -- 644 SCF.bs
D:\Perl\bin\perl.exe -MExtUtils::Mksymlists \
     -e "Mksymlists('NAME'=>\"Bio::SCF\", 'DLBASE' => 'SCF', 'DL_FUNCS' => {  }, 'FUNCLIST' => [], 'IMPORTS' => {  }, 'DL_VARS' => []);"
Set up gcc environment - 4.7.2
dlltool --def SCF.def --output-exp dll.exp
c:\MinGW\bin\g++.exe -o blib\arch\auto\Bio\SCF\SCF.dll -Wl,--base-file -Wl,dll.base -mdll -L"D:\Perl\lib\CORE" SCF.o D:\Perl\lib\CORE\perl512.lib c:\MinGW\lib\libkernel32.a c:\MinGW\lib\libuser32.a c:\MinGW\lib\libgdi32.a c:\MinGW\lib\libwinspool.a c:\MinGW\lib\libcomdlg32.a c:\MinGW\lib\libadvapi32.a c:\MinGW\lib\libshell32.a c:\MinGW\lib\libole32.a c:\MinGW\lib\liboleaut32.a c:\MinGW\lib\libnetapi32.a c:\MinGW\lib\libuuid.a c:\MinGW\lib\libws2_32.a c:\MinGW\lib\libmpr.a c:\MinGW\lib\libwinmm.a c:\MinGW\lib\libversion.a c:\MinGW\lib\libodbc32.a c:\MinGW\lib\libodbccp32.a c:\MinGW\lib\libcomctl32.a c:\MinGW\lib\libmsvcrt.a dll.exp
Warning: resolving _VirtualQuery@12 by linking to _VirtualQuery
Use --enable-stdcall-fixup to disable these warnings
Use --disable-stdcall-fixup to disable these fixups
Warning: resolving _VirtualProtect@16 by linking to _VirtualProtect
Warning: resolving _EnterCriticalSection@4 by linking to _EnterCriticalSection
Warning: resolving _TlsGetValue@4 by linking to _TlsGetValue
Warning: resolving _GetLastError@0 by linking to _GetLastError
Warning: resolving _LeaveCriticalSection@4 by linking to _LeaveCriticalSection
Warning: resolving _DeleteCriticalSection@4 by linking to _DeleteCriticalSection
Warning: resolving _InitializeCriticalSection@4 by linking to _InitializeCriticalSection
SCF.o:SCF.c:(.text+0xf35): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0xf4b): undefined reference to `mfwrite_scf'
SCF.o:SCF.c:(.text+0xf6a): undefined reference to `mfflush'
SCF.o:SCF.c:(.text+0xf72): undefined reference to `mfdestroy'
SCF.o:SCF.c:(.text+0x1138): undefined reference to `write_scf'
SCF.o:SCF.c:(.text+0x16ac): undefined reference to `scf_deallocate'
SCF.o:SCF.c:(.text+0x17b1): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0x17c1): undefined reference to `mfread_scf'
SCF.o:SCF.c:(.text+0x19bd): undefined reference to `read_scf'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: SCF.o: bad reloc address 0xa4 in section `.rdata'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: final link failed: Invalid operation
collect2.exe: error: ld returned 1 exit status
dmake.exe:  Error code 129, while making 'blib\arch\auto\Bio\SCF\SCF.dll'
  LDS/Bio-SCF-1.03.tar.gz
  D:\Perl\site\bin\dmake.exe -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 LDS/Bio-SCF-1.03.tar.gz                      : make NO

Warning: Configuration not saved.
Lockfile removed.

Thanks in advance for any useful suggestions/help!!

Peyman

From scott at scottcain.net Tue Feb 19 18:39:44 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 19 Feb 2013 18:39:44 -0500
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
In-Reply-To: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
References: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
Message-ID: <777246AB-2EF0-403D-9652-8EA8390D5C53@scottcain.net>

Hi Peyman,

I have no idea what might be required to get staden and Bio::SCF installed on a windows machine; you have my sympathies for having to go through it. But what I wanted to touch on was what you wrote, that is, that you "need" it for Bio::Graphics. I just wanted to point out that you don't need it unless you want to be able to display traces from ABI sequencers (which most people don't really care to do these days). Bio::SCF is listed as a recommended module, not a required one.

Scott

Sent from my iPad

On Feb 19, 2013, at 4:16 PM, peyman alavi wrote:

> Hello,
> I am having
> problems for a while trying to install the Bio::SCF module on my Vista32. Now, I know that Bio::SCF isn't really a Bioperl module, but I need it for Bio::Graphics, and I thought perhaps other people had experienced the same problem before. I
> have installed zlib and io_lib (both their last available versions), but it
> looks like sth.
> (presumably with io_lib) is missing. I should be very grateful
> if someone could tell me what still needs to be done!
> Here are
> the paths where the io_lib "library" and "include" directories are installed, and I
> set them to cpan before trying to install Bio::SCF:
> o conf
> makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"
> And the
> following is what I get on the STDOUT:
>
> [...]

From anngregory at email.arizona.edu Wed Feb 20 00:20:41 2013
From: anngregory at email.arizona.edu (Ann Gregory)
Date: Tue, 19 Feb 2013 22:20:41 -0700
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
Message-ID:

Hi BioPerl,

I am having issues with a BioPerl script. I have a blastxml file from a blastx blast and the original multifasta file containing the original nucleotides sequences.

I want to take the blast result (i.e. the blast description) and annotate my multifasta file.

I have written 2 while loops that extract the blast descriptions as well as the nucleotide sequences from the multifasta file.

My problem is that I cannot incorporate one of the while loops into the other without losing the loop property of one of the loops. I would like to take the 1st blast description, then the 1st nucleotide sequence, then the 2nd blast description, then the 2nd nucleotide sequence and so on... I just can't figure out how to alternate the results.

See script below:

use warnings;
use strict;
use Bio::SearchIO;
use Bio::SeqIO;

my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            my $qd = $hit->description;
            print $qd, "\n";
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

--
Ann (Nina) Gregory
Graduate Student
Rich Lab / Sullivan Lab
Soil, Water, Environmental Science Department
University of Arizona

From yonexhalaolv at gmail.com Wed Feb 20 04:17:12 2013
From: yonexhalaolv at gmail.com (Sebastian Lau)
Date: Wed, 20 Feb 2013 01:17:12 -0800 (PST)
Subject: [Bioperl-l] failed to install via fink：no package found for specification 'bioperl-pm5100'!
Message-ID: <84fa1bcb-a39f-4847-bff2-e3a9c2b909ea@googlegroups.com>

Hi guys,

I was just about to install bioperl on my MacOS 10.7.5 via fink, but after I typed the command, fink said it couldn't find any package:

fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm5100
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm5100'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm588
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm588'! fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm586 Information about 6901 packages read in 1 seconds. Failed: no package found for specification 'bioperl-pm586'! *I followed the instruction on wiki. I don't know what's wrong with it. Thanks for your help.* From awitney at sgul.ac.uk Wed Feb 20 10:22:51 2013 From: awitney at sgul.ac.uk (Adam Witney) Date: Wed, 20 Feb 2013 15:22:51 +0000 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file In-Reply-To: References: Message-ID: <5124EA4B.5020409@sgul.ac.uk> Hi Ann, On 20/02/2013 05:20, Ann Gregory wrote: > Hi BioPerl, > > I am having issues with a BioPerl script. I have a blastxml file from a > blastx blast and the original multifasta file containing the original > nucleotides sequences. > > I want to take the blast result (ie. the blast description) and annotate my > multifasta file. > > I have written 2 while loops that extract the blast descriptions as well as > the nucleotide sequence from the multifasta file. > > My problem is that I cannot incorporate one of the while loops into the > other without loosing the loop property of one of the loops. I would like > to take the 1st blast description, then the 1st nucleotide sequence, then > the 2nd blast description, then the 2nd nucleotide sequence and so > on...just can figure out how to alternate the results. 
> > See script below: > > > use warnings; > use strict; > use Bio::SearchIO; > use Bio::SeqIO; > > > my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => > "$ARGV[0]"); > while (my $result = $search_in->next_result) { > while (my $hit = $result->next_hit) { > while (my $hsp = $hit->next_hsp) { > my $qd = $hit->description; > print $qd, "\n"; > } > } > } > > my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); > while (my $seqobj = $seqio->next_seq) { > my $nuc = $seqobj->seq(); > print $nuc, "\n"; > }-- I think what you are proposing assumes that the loop over the BLAST results will come back in the same order as the loop over the Fasta file, this may be the case, but I'm not sure its something I would rely on. Anyway, I would loop over the BLAST results, storing the relevant data to an array or hash and then loop over the fasta file to put the two together. eg: my $blast_data; while ( ... blast data ... ) { ... $blast_data->{$qd} = ... } while ( my $seqobj = $seqio->next_seq ) { my $id = $seqobj->id; print $blast_data->{$id}."\n"; } something along those lines... or have i misunderstood you? if so can you provide some more details, like what do you want your output to look like? HTH Adam From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 11:24:50 2013 From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach) Date: Wed, 20 Feb 2013 17:24:50 +0100 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file In-Reply-To: References: Message-ID: <5124F8D2.4020904@uni-wuerzburg.de> oops, I just realized I had one loop to much in there. Adam is correct. Sorry. The last part of the code I send you should look like this: my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); while (my $seqobj = $seqio->next_seq) { print ">$hits{$seqobj->display_id}\n"; my $nuc = $seqobj->seq(); print $nuc, "\n"; } Cheers, Andreas -- Andreas Leimbach Universit?t M?nster Institut f?r Hygiene Mendelstr. 
7 D-48149 M?nster Germany Tel.: +49 (0)551 39 3843 E-Mail: andreas.leimbach at uni-wuerzburg.de On 20.2.13 06:20, Ann Gregory wrote: > Hi BioPerl, > > I am having issues with a BioPerl script. I have a blastxml file from a > blastx blast and the original multifasta file containing the original > nucleotides sequences. > > I want to take the blast result (ie. the blast description) and annotate my > multifasta file. > > I have written 2 while loops that extract the blast descriptions as well as > the nucleotide sequence from the multifasta file. > > My problem is that I cannot incorporate one of the while loops into the > other without loosing the loop property of one of the loops. I would like > to take the 1st blast description, then the 1st nucleotide sequence, then > the 2nd blast description, then the 2nd nucleotide sequence and so > on...just can figure out how to alternate the results. > > See script below: > > > use warnings; > use strict; > use Bio::SearchIO; > use Bio::SeqIO; > > > my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => > "$ARGV[0]"); > while (my $result = $search_in->next_result) { > while (my $hit = $result->next_hit) { > while (my $hsp = $hit->next_hsp) { > my $qd = $hit->description; > print $qd, "\n"; > } > } > } > > my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); > while (my $seqobj = $seqio->next_seq) { > my $nuc = $seqobj->seq(); > print $nuc, "\n"; > }-- > Ann (Nina) Gregory > Graduate Student > Rich Lab / Sullivan Lab > Soil, Water, Environmental Science Department > University of Arizona > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 11:14:29 2013 From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach) Date: Wed, 20 Feb 2013 17:14:29 +0100 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file 
In-Reply-To: References: Message-ID: <5124F665.5050602@uni-wuerzburg.de> Hi Ann, I agree with Adam, but I was already writing my email, while his came in. Hope it helps: I hope I understand correctly what you want to do. Just to clarify, you queried a protein blast database with blastx and nucleotide queries. Now you want to associate the protein description for the FIRST blast hit with the corresponding nucleotide fasta file. Is that correct? You have to put the two while loops into one another. Or associate the blast hits with the query descriptions. But it's not feasible to take the first blast hit and the first nucleotide fasta seq, then the 2nd of both etc, as Adam already pointed out. You would have to iterate through both at the same time. I.e. take the first blast hit, then iterate through the nucleotide fasta until you find the hit. Then take the 2nd blast hit and iterate through the nucleotide fasta etc. It's probably easiest to do this in a hash. Something along the lines of (not tested I just punched that in the E-Mail): my %hits; my $hit_desc; my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]"); while (my $result = $search_in->next_result) { while (my $hit = $result->next_hit) { while (my $hsp = $hit->next_hsp) { if ($hit->description eq $hit_desc) { # Only want the first blast hit next; } my $hit_desc = $hit->description; $hits{$result->query_description} = $hit_desc; } } } my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); foreach my $query (keys %hits) { while (my $seqobj = $seqio->next_seq) { if ($seqobj->display_id eq $query) { print ">$hits{$query}\n"; my $nuc = $seqobj->seq(); print $nuc, "\n"; } You might want to put some evalue cutoff in there to only score significant hits. Also if your nucleotide query multi-fasta file is very large, you might consider creating an index first: http://www.bioperl.org/wiki/HOWTO:Local_Databases#Bio::Index Hope that helps! 
Cheers, Andreas P.S.: Please next time include version numbers for BioPerl and Perl and a little more detail what you want to do. ;-) -- Andreas Leimbach Universit?t M?nster Institut f?r Hygiene Mendelstr. 7 D-48149 M?nster Germany Tel.: +49 (0)551 39 3843 E-Mail: andreas.leimbach at uni-wuerzburg.de On 20.2.13 06:20, Ann Gregory wrote: > Hi BioPerl, > > I am having issues with a BioPerl script. I have a blastxml file from a > blastx blast and the original multifasta file containing the original > nucleotides sequences. > > I want to take the blast result (ie. the blast description) and annotate my > multifasta file. > > I have written 2 while loops that extract the blast descriptions as well as > the nucleotide sequence from the multifasta file. > > My problem is that I cannot incorporate one of the while loops into the > other without loosing the loop property of one of the loops. I would like > to take the 1st blast description, then the 1st nucleotide sequence, then > the 2nd blast description, then the 2nd nucleotide sequence and so > on...just can figure out how to alternate the results. 
> > See script below: > > > use warnings; > use strict; > use Bio::SearchIO; > use Bio::SeqIO; > > > my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => > "$ARGV[0]"); > while (my $result = $search_in->next_result) { > while (my $hit = $result->next_hit) { > while (my $hsp = $hit->next_hsp) { > my $qd = $hit->description; > print $qd, "\n"; > } > } > } > > my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); > while (my $seqobj = $seqio->next_seq) { > my $nuc = $seqobj->seq(); > print $nuc, "\n"; > }-- > Ann (Nina) Gregory > Graduate Student > Rich Lab / Sullivan Lab > Soil, Water, Environmental Science Department > University of Arizona > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 12:00:51 2013 From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach) Date: Wed, 20 Feb 2013 18:00:51 +0100 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file In-Reply-To: References: <5124F8D2.4020904@uni-wuerzburg.de> Message-ID: <51250143.9050503@uni-wuerzburg.de> Hey Ann, damn, it 's not my best day ... Anyways, I wouldn't work with List::MoreUtils's each_array function, as this assumes that the blast hits and the nucleotide queries are in the same order (as Adam pointed out). Rather use a hash which associates a key to a certain value. Also, the hash can be used to skip sequences that have no hits. 
Here's my new version:

    my %hits;
    my $hit_desc;
    my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
    while (my $result = $search_in->next_result) {
        while (my $hit = $result->next_hit) {
            while (my $hsp = $hit->next_hsp) {
                $hits{$result->query_description} = $hit->description; # hash: associate query_desc (key) with hit_desc (value)
                last; # jump out of the while loop; this should resolve getting only the first hit
            }
            last; # see above
        }
    }

    my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
    while (my $seqobj = $seqio->next_seq) {
        if ($hits{$seqobj->display_id}) { # only true if display_id associated with hit_desc and should skip seqs without hits
            print ">$hits{$seqobj->display_id}\n";
            my $nuc = $seqobj->seq();
            print $nuc, "\n";
        }
    }

Cheers,
Andreas

P.S.: I redirected your mail to the BioPerl mailing list, others might profit from my mistakes ;-) ...

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 17:35, Ann Gregory wrote:
> Hi Andreas,
>
> Thanks for you help! I don't understand how this gets the first blast hit:
>
> if ($hit->description eq $hit_desc) { # Only want the first blast hit
> next;
> }
>
> I tried this and seems to be working...but I can't get the 1st blast hit
> or skip the sequences that had no hits. Do you know any quick fixes?
> > * > use warnings; > use strict; > use Bio::SearchIO; > use Bio::SeqIO; > use List::MoreUtils qw(each_array); > > my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => > "$ARGV[0]"); > my @ids; > while (my $result = $search_in->next_result) { > while (my $hit = $result->next_hit) { > while (my $hsp = $hit->next_hsp) { > my $match = $result->num_hits; > push(@ids, $qd); > } > } > } > } > > my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); > my @seqs; > while (my $seqobj = $seqio->next_seq) { > my $nuc = $seqobj->seq(); > push(@seqs, $nuc); > } > > my $it = each_array(@ids, at seqs); > while(my($ids,$seqs)=$it->()){ > print $ids, "\n", $seqs, "\n"; > } > * > > Thanks again! > ~Ann > > On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach > > wrote: > > oops, I just realized I had one loop to much in there. Adam is > correct. Sorry. > > The last part of the code I send you should look like this: > > > my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); > while (my $seqobj = $seqio->next_seq) { > print ">$hits{$seqobj->display_id}\__n"; > > my $nuc = $seqobj->seq(); > print $nuc, "\n"; > } > > > Cheers, > Andreas > > > -- > Andreas Leimbach > Universit?t M?nster > Institut f?r Hygiene > Mendelstr. 7 > D-48149 M?nster > Germany > > Tel.: +49 (0)551 39 3843 > E-Mail: andreas.leimbach at uni-__wuerzburg.de > > > On 20.2.13 06:20, Ann Gregory wrote: > > Hi BioPerl, > > I am having issues with a BioPerl script. I have a blastxml file > from a > blastx blast and the original multifasta file containing the > original > nucleotides sequences. > > I want to take the blast result (ie. the blast description) and > annotate my > multifasta file. > > I have written 2 while loops that extract the blast descriptions > as well as > the nucleotide sequence from the multifasta file. > > My problem is that I cannot incorporate one of the while loops > into the > other without loosing the loop property of one of the loops. 
I > would like > to take the 1st blast description, then the 1st nucleotide > sequence, then > the 2nd blast description, then the 2nd nucleotide sequence and so > on...just can figure out how to alternate the results. > > See script below: > > > use warnings; > use strict; > use Bio::SearchIO; > use Bio::SeqIO; > > > my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => > "$ARGV[0]"); > while (my $result = $search_in->next_result) { > while (my $hit = $result->next_hit) { > while (my $hsp = $hit->next_hsp) { > my $qd = $hit->description; > print $qd, "\n"; > } > } > } > > my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => > "$ARGV[1]"); > while (my $seqobj = $seqio->next_seq) { > my $nuc = $seqobj->seq(); > print $nuc, "\n"; > }-- > Ann (Nina) Gregory > Graduate Student > Rich Lab / Sullivan Lab > Soil, Water, Environmental Science Department > University of Arizona > _________________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/__mailman/listinfo/bioperl-l > > > > > > -- > Ann (Nina) Gregory > Graduate Student > Rich Lab / Sullivan Lab > Soil, Water, Environmental Science Department > University of Arizona > > > From cjfields at illinois.edu Wed Feb 20 13:24:58 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 20 Feb 2013 18:24:58 +0000 Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file In-Reply-To: <51250143.9050503@uni-wuerzburg.de> References: <5124F8D2.4020904@uni-wuerzburg.de> <51250143.9050503@uni-wuerzburg.de> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2EB4A@CHIMBX5.ad.uillinois.edu> If this is meant to be something done using the same FASTA files for a bunch of BLAST reports, might be worth setting up a flat file index and using that to look up and grab the sequences; it should be a LOT faster, just the first pass (generation of the initial index) would take a little time. Look at Bio::DB::Fasta for an example. 
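
A minimal sketch of the indexed lookup suggested above, not code from the thread. It assumes that the first argument is the BLAST XML report, the second is the query multi-FASTA file, and that `query_name` in the report equals the FASTA ID; none of this is confirmed by the posters. The helper `fasta_record` is a hypothetical name introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): format one annotated FASTA record.
sub fasta_record {
    my ($id, $desc, $seq) = @_;
    return ">$id $desc\n$seq\n";
}

# Run the BioPerl part only when both file names are given:
#   ARGV[0] = BLAST XML report, ARGV[1] = query multi-FASTA (assumed layout).
if (@ARGV == 2) {
    require Bio::SearchIO;
    require Bio::DB::Fasta;
    my ($blast_file, $fasta_file) = @ARGV;

    # The first call builds an index file next to the FASTA file; later runs reuse it.
    my $db = Bio::DB::Fasta->new($fasta_file);

    my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => $blast_file);
    while (my $result = $search_in->next_result) {
        my $hit = $result->next_hit or next;    # skip queries without hits
        my $id  = $result->query_name;          # assumed to equal the FASTA ID
        my $seq = $db->seq($id);                # indexed lookup, no file scan
        next unless defined $seq;               # ID not present in the FASTA file
        print fasta_record($id, $hit->description, $seq);
    }
}
```

Because each sequence is fetched by ID, the order of the BLAST results and of the FASTA file no longer has to match, and the same index can be reused across many reports.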
chris On Feb 20, 2013, at 11:00 AM, Andreas Leimbach wrote: > Hey Ann, > > damn, it 's not my best day ... Anyways, I wouldn't work with List::MoreUtils's each_array function, as this assumes that the blast hits and the nucleotide queries are in the same order (as Adam pointed out). Rather use a hash which associates a key to a certain value. Also, the hash can be used to skip sequences that have no hits. > Here's my new version: > > my %hits; > my $hit_desc; > my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => > "$ARGV[0]"); > while (my $result = $search_in->next_result) { > while (my $hit = $result->next_hit) { > while (my $hsp = $hit->next_hsp) { > $hits{$result->query_description} = $hit->description; # hash: associate query_desc (key) with hit_desc (value) > last; # jump out of the while loop; this should resolve getting only the first hit > } > last; # see above > } > } > > > my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); > while (my $seqobj = $seqio->next_seq) { > if ($hits{$seqobj->display_id}) { # only true if display_id associated with hit_desc and should skip seqs without hits > print ">$hits{$seqobj->display_id}\n"; > my $nuc = $seqobj->seq(); > print $nuc, "\n"; > } > } > > Cheers, > Andreas > > P.S.: I redirected your mail to the BioPerl mailing list, others might profit from my mistakes ;-) ... > > -- > Andreas Leimbach > Universit?t M?nster > Institut f?r Hygiene > Mendelstr. 7 > D-48149 M?nster > Germany > > Tel.: +49 (0)551 39 3843 > E-Mail: andreas.leimbach at uni-wuerzburg.de > > On 20.2.13 17:35, Ann Gregory wrote: >> Hi Andreas, >> >> Thanks for you help! I don't understand how this gets the first blast hit: >> >> if ($hit->description eq $hit_desc) { # Only want the first blast hit >> next; >> } >> >> I tried this and seems to be working...but I can't get the 1st blast hit >> or skip the sequences that had no hits. Do you know any quick fixes? 
>> >> * >> use warnings; >> use strict; >> use Bio::SearchIO; >> use Bio::SeqIO; >> use List::MoreUtils qw(each_array); >> >> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => >> "$ARGV[0]"); >> my @ids; >> while (my $result = $search_in->next_result) { >> while (my $hit = $result->next_hit) { >> while (my $hsp = $hit->next_hsp) { >> my $match = $result->num_hits; >> push(@ids, $qd); >> } >> } >> } >> } >> >> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); >> my @seqs; >> while (my $seqobj = $seqio->next_seq) { >> my $nuc = $seqobj->seq(); >> push(@seqs, $nuc); >> } >> >> my $it = each_array(@ids, at seqs); >> while(my($ids,$seqs)=$it->()){ >> print $ids, "\n", $seqs, "\n"; >> } >> * >> >> Thanks again! >> ~Ann >> >> On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach >> > > wrote: >> >> oops, I just realized I had one loop to much in there. Adam is >> correct. Sorry. >> >> The last part of the code I send you should look like this: >> >> >> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]"); >> while (my $seqobj = $seqio->next_seq) { >> print ">$hits{$seqobj->display_id}\__n"; >> >> my $nuc = $seqobj->seq(); >> print $nuc, "\n"; >> } >> >> >> Cheers, >> Andreas >> >> >> -- >> Andreas Leimbach >> Universit?t M?nster >> Institut f?r Hygiene >> Mendelstr. 7 >> D-48149 M?nster >> Germany >> >> Tel.: +49 (0)551 39 3843 >> E-Mail: andreas.leimbach at uni-__wuerzburg.de >> >> >> On 20.2.13 06:20, Ann Gregory wrote: >> >> Hi BioPerl, >> >> I am having issues with a BioPerl script. I have a blastxml file >> from a >> blastx blast and the original multifasta file containing the >> original >> nucleotides sequences. >> >> I want to take the blast result (ie. the blast description) and >> annotate my >> multifasta file. >> >> I have written 2 while loops that extract the blast descriptions >> as well as >> the nucleotide sequence from the multifasta file. 
>> >> My problem is that I cannot incorporate one of the while loops >> into the >> other without loosing the loop property of one of the loops. I >> would like >> to take the 1st blast description, then the 1st nucleotide >> sequence, then >> the 2nd blast description, then the 2nd nucleotide sequence and so >> on...just can figure out how to alternate the results. >> >> See script below: >> >> >> use warnings; >> use strict; >> use Bio::SearchIO; >> use Bio::SeqIO; >> >> >> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => >> "$ARGV[0]"); >> while (my $result = $search_in->next_result) { >> while (my $hit = $result->next_hit) { >> while (my $hsp = $hit->next_hsp) { >> my $qd = $hit->description; >> print $qd, "\n"; >> } >> } >> } >> >> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => >> "$ARGV[1]"); >> while (my $seqobj = $seqio->next_seq) { >> my $nuc = $seqobj->seq(); >> print $nuc, "\n"; >> }-- >> Ann (Nina) Gregory >> Graduate Student >> Rich Lab / Sullivan Lab >> Soil, Water, Environmental Science Department >> University of Arizona >> _________________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/__mailman/listinfo/bioperl-l >> >> >> >> >> >> -- >> Ann (Nina) Gregory >> Graduate Student >> Rich Lab / Sullivan Lab >> Soil, Water, Environmental Science Department >> University of Arizona >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From carandraug+dev at gmail.com Mon Feb 25 05:08:23 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Mon, 25 Feb 2013 10:08:23 +0000 Subject: [Bioperl-l] module for description of sequence variants (where to place code) Message-ID: Hi I'm writing a perl module to write a description of the variance between 2 sequences as described on http://www.hgvs.org/mutnomen/recs-prot.html 

Basically, given 2 sequences, it would return something like "p.Lys2del p.His25_Met26insGln" if those are the differences. It also accounts for the existence of '-' (gap) characters in the sequences that may come from their alignment.

My question is, where on the project tree should I place the module?

Also, is there something already written that would convert from 1 to 3 letter code?

Carnë

From andreas.leimbach at uni-wuerzburg.de Mon Feb 25 05:32:43 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Mon, 25 Feb 2013 11:32:43 +0100
Subject: [Bioperl-l] module for description of sequence variants (where to place code)
In-Reply-To:
References:
Message-ID: <512B3DCB.7050008@uni-wuerzburg.de>

Hi Carnë,

for your last question: You can convert aa strings from one to three letter code with 'Bio::SeqUtils'.

Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 25.2.13 11:08, Carnë Draug wrote:
> Hi
>
> I'm writing a perl module to write a description of the variance
> between 2 sequences as described on
> http://www.hgvs.org/mutnomen/recs-prot.html
>
> Basically, given 2 sequences, would returns something like "p.Lys2del
> p.His25_Met26insGln" if those are the differences. It also accounts
> for the existence of - characters on the sequences that may come from
> their alignment.
>
> My question is, where on the project tree should I place the module?
>
> Also, is there something already written that would convert from 1 to
> 3 letter code?
>
> Carnë
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From genehack at genehack.org Wed Feb 27 19:57:48 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Wed, 27 Feb 2013 16:57:48 -0800
Subject: [Bioperl-l] YAPC talks?
Message-ID: Hi - Is there anyone that was planning on submitting a Bioperl talk to YAPC::NA? In an unrelated conversation, one of the organizers expressed an interest in getting a Bioperl talk this year. If no one else is planning on a talk submission, Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl rules / Bioperl sucks" overview/state of the dist style talk... thanks, john. From cjfields at illinois.edu Wed Feb 27 21:48:55 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 28 Feb 2013 02:48:55 +0000 Subject: [Bioperl-l] YAPC talks? In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6E705CD3@CHIMBX5.ad.uillinois.edu> At the moment I personally have no plans on going, but I think a no-holds-barred bioperl talk is a good idea. chris On Feb 27, 2013, at 6:57 PM, John SJ Anderson wrote: > Hi - > > Is there anyone that was planning on submitting a Bioperl talk to > YAPC::NA? In an unrelated conversation, one of the organizers > expressed an interest in getting a Bioperl talk this year. > > If no one else is planning on a talk submission, Jay Hannah (aka > deafferret) and I are promising/threatening a tag-team style "Bioperl > rules / Bioperl sucks" overview/state of the dist style talk... > > thanks, > john. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at drycafe.net Wed Feb 27 22:20:34 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 27 Feb 2013 22:20:34 -0500 Subject: [Bioperl-l] YAPC talks? In-Reply-To: References: Message-ID: <42C1F1B8-FE26-43A8-B601-E80D17D215EC@drycafe.net> On Feb 27, 2013, at 7:57 PM, John SJ Anderson wrote: > Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl > rules / Bioperl sucks" overview/state of the dist style talk... Please videotape. 
I'll be sure to watch and promote it :-)

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================

From saladi1 at illinois.edu Thu Feb 28 01:58:20 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Wed, 27 Feb 2013 22:58:20 -0800
Subject: [Bioperl-l] EUtilities Cookbook - Accn to gi
Message-ID:

Hi,

I think that rettype for the section "Get GIs for a list of accessions" should be

    -rettype => 'gi');

instead of 'gilist' as it is now. I think this is due to a change in NCBI eutils.

webpage: http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Get_GIs_for_a_list_of_accessions

Thanks,
Shyam

From fossandonc at hotmail.com Thu Feb 28 10:36:34 2013
From: fossandonc at hotmail.com (Francisco J. Ossandón)
Date: Thu, 28 Feb 2013 12:36:34 -0300
Subject: [Bioperl-l] Fix for Bug #3376 broke somewhere else
Message-ID:

Hi,

I was re-checking Bug #3302 using the Bio::SearchIO modules of the repository and found that now it can't parse a Hmmer2 file that was previously fine. After tracking the problem, I discovered that a change in a regular expression to fix another bug broke the parsing.

The fix for Bug #3376 consisted of adding an extra condition to omit lines where the end-of-domain indicator is split across lines (https://redmine.open-bio.org/issues/3376):

    TEST: domain 1 of 1, from 8 to 97: score 184.7, E = 2.5e-56
    *->svfqqqqssksttgstvtAiAiAigYRYRYRAvtWnsGsLssGvnDn
    sv+qqqq+ + +vtAiAiAigYRYRYRAv Wn GsLs G nDn
    Test 8 SVYQQQQGGSA----MVTAIAIAIGYRYRYRAVVWNKGSLSTGTNDN 50

    DnDqqsdgLYtiYYsvtvpssslpsqtviHHHaHkasstkiiikiePr<-
    DnDq +d LYtiYYsvtv +ss+p q+v+HHHaH+asstkiiiki P
    Test 51 DNDQAAD-LYTIYYSVTVSASSWPGQSVTHHHAHPASSTKIIIKIAPS 97

    *
    Test - -

This case is characterized by the 2 dashes in the line... So the expression added in hmmer2.pm - 'next_result'
(https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2):

    elsif (CORE::length($_) == 0
        || ( $count != 1 && /^\s+$/o )
        || /^\s+\-?\*\s*$/
        || /^.+\-\s+\-\s*$/ )    ### <--- This regex was designed for bug 3376
    {
        next;
    }

But the expression is too broad because it uses "^.+" just before the 2 dashes, so it also breaks the parsing of lines like these, which are full of dashes:

    KyACrqCdtiVQAPaPakpIErGiptaGLLArvlVSKyaEHlPLYRQsEI
    lcl|gi|340 - -------------------------------------------------- -

    yaRqGVeiaRstLadWVgrtgarLaPLvdALaeyVLkeGklHADeTPVqV
    +i s L V++ + r
    lcl|gi|340 60938 ------AIMISGLIHGVSARCLRF-------------------------- 60955

I think a reasonable fix that still fixes the original bug and restores the function for this case is to add an extra \s+ in the regex just before the first dash, so the expression makes sure that the first dash is the one that comes AFTER the description (and is replacing the usual coordinate number) and is not the last of an alignment or a series of dashes like the one above:

    elsif (CORE::length($_) == 0
        || ( $count != 1 && /^\s+$/o )
        || /^\s+\-?\*\s*$/
        || /^.+\s+\-\s+\-\s*$/ )    ### <--- Tweaked regex
    {
        next;
    }

I tested it and it works fine; I hope you find the fix acceptable.

Cheers,

--
Francisco J. Ossandon
Bioinformatician.
Ph.D. Candidate, University Andres Bello.
Center for Bioinformatics and Genome Biology,
Fundacion Ciencia para la Vida.
Santiago, Chile.
www.cienciavida.cl/CBGB.htm

From PDagosto at edgebio.com Mon Feb 25 11:50:34 2013
From: PDagosto at edgebio.com (Phil Dagosto)
Date: Mon, 25 Feb 2013 16:50:34 +0000
Subject: [Bioperl-l] Error when running Build.PL
Message-ID:

Greetings,

I downloaded BioPerl 1.6.1 from this location:

http://www.bioperl.org/wiki/Getting_BioPerl

When I ran Build.PL with all of the default settings chosen in the interactive mode I got the following error message:

Could not get valid metadata. Error is: Invalid metadata structure.
Errors: 'Perl_5' for 'license' does not have a URL scheme (resources -> license) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::gff -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::WebAgent -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::EUtilParameters -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::OntologyIO::InterProParser -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Biblio::IO::medlinexml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::strider -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::RandomFactory -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA::ESEfinder -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::game::gameSubs -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::interpro -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::berkeleydb -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::entrezgene -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tinyseq -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::chadoxml -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::SeqIO::game::gameWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::FileCache -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::bsml_sax -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Primer3 -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::HtSNP -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tree::Compatible -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Taxonomy::entrez -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::agave -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::TagHaplotype -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::SeqFeature::Store::FeatureFileLoader -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::Protein* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::blastxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::EUtilities -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::Tree::Draw::Cladogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tigrxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Collection -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Draw::Pictogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::Writer::BSMLResultWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::HIVQuery -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::TreeIO::svggraph -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::eutils -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern::BackTranslate -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::GenBank -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Variation::IO::xml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::GraphViz -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Annotated -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::DB::NCBIHelper -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::HIV -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Run::RemoteBlast -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::excel -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::ClusterIO::dbsnp -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Microarray::Tools::ReseqChip -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::soap -> requires) [Validation: 1.2] at /usr/local/lib/perl5/5.10.1/Module/Build/Base.pm line 4559 Could not create MYMETA files Creating new 'Build' script for 'BioPerl' version '1.006001' I have no idea whether this is a problem or not or if I can proceed. Also, I'm confused by the version number referenced in the last line. 1.006001 is our current version - I thought I was installing version 1.6.1. Are these version numbers equivalent, i.e., are the zeros not meaningful?. I was actually looking for version 1.2.3 (or greater) - where can I find that? Thanks, Phil Phil Dagosto Sr. 
Software Engineer
Edge Bio
201 Perry Parkway, Suite 5
Gaithersburg, MD 20850
pdagosto at edgebio.com
(240) 912-8669

From chapmanb at 50mail.com Thu Feb 28 21:30:01 2013
From: chapmanb at 50mail.com (Brad Chapman)
Date: Thu, 28 Feb 2013 21:30:01 -0500
Subject: [Bioperl-l] Coming soon: BOSC/Broad Hackathon, BOSC Codefest
Message-ID: <874ngvua1i.fsf@fastmail.fm>

Hi all;
There are some upcoming coding events and conferences of interest to
open source biology programmers:

- BOSC/Broad Interoperability Hackathon -- This is a two day coding
  session at the Broad Institute in Cambridge, MA on April 7-8 focused
  on improving tool interoperability.

  Sign up and details: http://j.mp/XJT6ew

- Codefest at the Bioinformatics Open Source Conference -- This year
  BOSC is taking place in Berlin from July 19-20 and we'll have a two
  day coding session before the conference. This is the 4th year of
  Codefests and they've proven to be a productive and fun time to work
  collectively on open source projects.

  Sign up and details: http://www.open-bio.org/wiki/Codefest_2013
  BOSC conference: http://www.open-bio.org/wiki/BOSC_2013

Here are the key dates for the events and abstracts:

April 7-8, 2013: BOSC/Broad Interoperability Hackathon, Cambridge, MA
April 12, 2013: BOSC abstracts due
July 17-18, 2013: Codefest 2013, Berlin
July 19-20, 2013: BOSC 2013, Berlin

Looking forward to seeing everyone this spring and summer for plenty of
fun science and code,
Brad

From koriege at googlemail.com Fri Feb 1 02:49:20 2013
From: koriege at googlemail.com (koriege at googlemail.com)
Date: Thu, 31 Jan 2013 18:49:20 -0800 (PST)
Subject: [Bioperl-l] problem with Bio::*::Fasta id_parser
Message-ID:

Hi,

I tried two methods to create a BioPerl FASTA database, but both fail to
extract the substring from my headers. Can someone explain to me why I
get the standard header, or show me a workaround?

Thanks in advance.
pyr0

i)

  my $objDB = Bio::Index::Fasta->new(-filename => $PATHdbIdx, -write_flag => 1);
  $objDB->id_parser(\&get_id);
  $objDB->make_index(glob($objParameter->dbGenome()));

  sub get_id {
      my $header = shift;
      $header =~ /^>.*\bsp\|([A-Z]\d{5}\b)/;
      $1;
  }

output:

  Use of uninitialized value $id in concatenation (.) or string at /usr/share/perl5/Bio/Index/Abstract.pm line 753, <$FASTA> line 1.
  Use of uninitialized value $id in exists at /usr/share/perl5/Bio/Index/Abstract.pm line 754, <$FASTA> line 1.
  Use of uninitialized value $id in hash element at /usr/share/perl5/Bio/Index/Abstract.pm line 757, <$FASTA> line 1.
  gi|376282008|ref|NC_016798.1|

ii)

  my $PATHdbIdx=catfile($objParameter->DIR,'data','db.idx');
  unlink($PATHdbIdx);
  my $objDB = Bio::DB::Fasta->new($objParameter->dbGenome(), -makeid => \&get_id);
  $objDBgenome->set(\$objDB);

output:

  Use of uninitialized value $key in pattern match (m//) at /usr/share/perl5/Bio/DB/Fasta.pm line 1178.
  Use of uninitialized value $id in exists at /usr/share/perl5/Bio/DB/Fasta.pm line 617.
  gi|376282008|ref|NC_016798.1|

From jason.stajich at gmail.com Fri Feb 1 06:58:57 2013
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 31 Jan 2013 22:58:57 -0800
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
References: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
Message-ID:

Dan -

I think the answer is yes if others are doing it - I am not in a position to be much of a main coder. I don't know which format you speak of here or if you had to write something for the text blast changes or something else. Specific bug reports on formats that aren't working are always helpful. The XML format has been pretty stable, so I would suggest using it if you are simply parsing reports rather than reading them. Chris posted instructions on how to contribute, and the move to GitHub simplifies this.
That you had to write a whole new parser seems a bit severe - I hope that in the future people can speak to the problems sooner. If I hit a wall with something I can't do I usually write the code to fix it and contribute it back, but I no longer play follow-the-format-changes with the tools; hopefully others like yourself can make those contributions.

If you are referring to the response I made to the question below, I don't think anyone will try to support NCBI's additional markup for the upstream and downstream features, as laid out in the text files, without some serious effort. Perhaps in the future that information will be reported in the XML format and thus be more parseable.

best wishes,
Jason

On Jan 30, 2013, at 1:40 PM, Dan kilburn wrote:

> Hi Jason,
>
> Are there any plans to keep SearchIO up to date with ncbi blast? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it, but it's so fundamental. Maybe I can help.
>
> --Dan
> Sent from my iPhone
>
> On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote:
>
>> Send Bioperl-l mailing list submissions to
>> bioperl-l at lists.open-bio.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> or, via email, send a message with subject or body 'help' to
>> bioperl-l-request at lists.open-bio.org
>>
>> You can reach the person managing the list at
>> bioperl-l-owner at lists.open-bio.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Bioperl-l digest..."
>>
>>
>> Today's Topics:
>>
>> 1. Re: Parsing Blast-Report extracting "Features flanking .."
>> (Jason Stajich) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Tue, 29 Jan 2013 11:00:16 -0800 >> From: Jason Stajich >> Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features >> flanking .." >> To: buschj at hhu.de >> Cc: bioperl-l at lists.open-bio.org >> Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com> >> Content-Type: text/plain; charset=us-ascii >> >> We don't parse the NCBI feature info from the BLAST reports per your query. To look up a specific feature you can use Bio::DB::GenBank to query for sequence from a specific feature by accession number - see the HOWTOs for that. >> >> However, most people use tools that generate SAM/BAM files with short reads - then you can use a tool like bedtools to find overlaps of reads with the locations of features. >> >> basically: >> - download the genome and GFF for arabidopsis >> - align your sRNA to the genome with a short read aligner - bowtie, bwa, others >> - convert your sam to bam file with SAMtools or picard >> - compare the location of features with the reads to get expression summaries or individuals reads with BEDTools >> >> >> On Jan 25, 2013, at 2:20 AM, jobu wrote: >> >>> Am 22.01.2013 19:03, schrieb Mgavi Brathwaite: >>>> What upstream and downstream elements are you interested in? >>> >>> >>> I've got a huge pile of short RNA reads. >>> Part of the question now is whether those RNA fragments originate from >>> siRNA events, >>> or may represent miRNAs / parts of pre-miRNAs. >>> >>> So I did an online blast search against database nt. >>> The resulting report quite often just gives subject information like this: >>> >>> ----- >>>> gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence >>> Length=23459830 >>> ----- >>> >>> Now I would like to get the hit's neighbouring regions for further >>> analysis. 
>>> Preferably I would like to do that in an automized way, but the only >>> possible action with this kind of subject gi | description would be to >>> fetch the entire chromosomal sequence I guess ? >>> >>> However, >>> right below the line above, the report states more precisely: >>> >>> ------ >>> Features flanking this part of subject sequence: >>> 8872 bp at 5' side: cytochrome P450 90B1 >>> 402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K >>> ------ >>> >>> Still I would like to have the possibility to automatically fetch the >>> subject's sequence(s), >>> as of now I think parsing the report with SearchIO won't let me aquire >>> that information, because SearchIO does not recognize report sections >>> like those. >>> >>> I hope I did not miss any of SearchIOs capabilities, but I could not >>> find any method covering my wish?! >>> >>> Right now maybe the only way to get the information I want is to >>> construct my own parser and write it out into a separate file, which in >>> turn again I could read into a hash before processing the Blast-Report >>> with SearchIO to combine both data for further automized work. >>> >>> I am aware though that even successfully getting the flanking features >>> would leave me with the more or less wide intergenic gap my hsp is >>> located in. >>> >>> However I'm in need of a way to get the flanking features including >>> their annotation and the region spanning between them. >>> But I hope I do not have to get complete sequences to accomplish that, >>> as this would be kind of an overkill. 
>>>
>>> with kind regards
>>> Jochen
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>> ------------------------------
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> End of Bioperl-l Digest, Vol 117, Issue 13
>> ******************************************
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org

From dr_kilburn59 at yahoo.com Fri Feb 1 14:25:34 2013
From: dr_kilburn59 at yahoo.com (Dan Kilburn)
Date: Fri, 1 Feb 2013 06:25:34 -0800 (PST)
Subject: [Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13
In-Reply-To:
References: <575B184F-C051-4FEF-9BEB-7AB98E3C52A6@yahoo.com>
Message-ID: <1359728734.27412.YahooMailNeo@web162006.mail.bf1.yahoo.com>

Hi Jason,

Thanks for the detailed feedback. The real reason I had to write my own parser is that even with close, repeated support from NCBI we couldn't get XML output with short_web_blast.pl because the parameter that turns on XML output was not functioning (they've probably fixed it by now), and I had to crank out a parser asap to support a job talk.

I don't think the upstream and downstream feature reports are particularly useful, because in mammals they tend to be so far away that they are not likely to be biologically relevant. But the internal motif reports are useful, maybe especially if you are blasting short reads, like I was. A 16-mer preserved domain hit is really good if you're blasting 18-mer Illumina short reads, like I was.

As far as my involvement goes, I got diagnosed with cancer on Wednesday, so I'll be taking a step back until next week's surgery and taking a lot of deep breaths. On the other hand, this just makes me more motivated: I've been thinking a lot about time, and timely contributions, the last two days.

Cheers,
Dan

From carandraug+dev at gmail.com Sun Feb 3 01:44:31 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Sun, 3 Feb 2013 01:44:31 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option
Message-ID:

Hi

the TCoffee module does not accept options of the named argument type:

  -arg => option

one needs to write

  'arg' => option

Is there a special reason for this?
I tracked this down to commit

7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e

made 12 years ago[1]. A comment in the code actually says "don't want named
parameters"[2] (though the commit message sounds pretty innocuous:
"migrated to new Bio::Root::RootI chained new"). Is there a reason for
this? The rest of bioperl has no issue with named parameters, and the
API should be the same as Clustalw, which also has no problem with it.
This is very easy to fix; I can submit a pull request, no problem.

Also, shouldn't the code complain in the case of non-supported
options? It took me a very long time to find the problem because
there were no complaints coming from the code.

There is also a problem with the way it handles the output option.
I'll have to look closer into it, but the documentation is simply
incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
(undocumented) works fine.

Carnë

[1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
[2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374

From cjfields at illinois.edu Sun Feb 3 21:54:51 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Sun, 3 Feb 2013 21:54:51 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option
In-Reply-To:
References:
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>

Carnë,

On Feb 2, 2013, at 7:44 PM, Carnë Draug wrote:

> Hi
>
> the TCoffee module does not accept options of the named argument type:
>
> -arg => option
>
> one needs to do like
>
> 'arg' => option
>
> Is there a special reason for this? I tracked down this to the commit
>
> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
>
> 12 years ago[1]. A comment on the code actually says "don't want named
> parameters"[2] (though the commit message sounds pretty innocuous
> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
> this? The rest of bioperl has no issue with named parameters, and the
> API should be the same as Clustalw which also has no problem with it.
> This is very easy to fix, I can submit a pull request no problem.

IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones. This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless of whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.

The downside of big changes like this: potential backwards compatibility issues. Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change. I don't have a problem breaking this with a bioperl 2.0 release, though.

> Also, shouldn't the code complain in the case of non-supported
> options? Took me a very long time to find out the problem because
> there was no complaints coming from the code.

Yes, it should complain when options are given that do not make sense; some validation would help there. With some modules this might be a side-effect of using AUTOLOAD or simply not checking the parameters.

> There is also a problem with the way it handles the output option.
> I'll have to look closer into it, but the documentation is simply
> incorrect. "'output' => 'fasta_aln'" gives an error while just 'fasta'
> (undocumented), works fine.

That's entirely possible.

> Carnë
> [1] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
> [2] https://github.com/carandraug/bioperl-run/commit/7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e#L0R374

As an aside, there are a few downsides to trying to implement command-line parameters as Perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc.), so you have to hack your way around it. Infernal was this way IIRC. Maybe these should simply be stored as a semi-validated set of key-value pairs.

chris

From carandraug+dev at gmail.com Mon Feb 4 04:34:22 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Mon, 4 Feb 2013 04:34:22 +0000
Subject: [Bioperl-l] TCofee does not accept named arguments and issue with output option
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE14D30@CHIMBX5.ad.uillinois.edu>
Message-ID:

On 3 February 2013 21:54, Fields, Christopher J wrote:
> On Feb 2, 2013, at 7:44 PM, Carnë Draug wrote:
>
>> Hi
>>
>> the TCoffee module does not accept options of the named argument type:
>>
>> -arg => option
>>
>> one needs to do like
>>
>> 'arg' => option
>>
>> Is there a special reason for this? I tracked down this to the commit
>>
>> 7b2f9189fe2af4f99f1c3d7fe79c732bfe01261e
>>
>> 12 years ago[1]. A comment on the code actually says "don't want named
>> parameters"[2] (though the commit message sounds pretty innocuous
>> "migrated to new Bio::Root::RootI chained new"). Is there a reason for
>> this? The rest of bioperl has no issue with named parameters, and the
>> API should be the same as Clustalw which also has no problem with it.
>> This is very easy to fix, I can submit a pull request no problem.
>
> IIRC the reasoning behind this was to differentiate Bioperl parameters from command-specific ones. This decision predates my involvement w/ core dev, but my general feeling is that anything that is an object attribute (regardless whether it is a direct representation of a value passed to a wrapped program or not) should be preceded by '-' for consistency.
>
> The downside of big changes like this: potential backwards compatibility issues. Such changes would need to be tested out rigorously, as there are a ton of old scripts that would potentially break with a direct change. I don't have a problem breaking this with a bioperl 2.0 release, though.

Should passing the tests be enough? There's one for TCoffee. At the moment I don't see how this would cause compatibility issues: we are adding an option, not removing one. But the comment in the code, stating plainly that the -param API was not wanted, caught me by surprise, which is why I'm asking.

> As an aside, there are a few downsides of trying to implement command-line parameters as perl object attributes (getter/setter), one being that many can't be directly represented as an object attribute (namely, anything that can't be a getter/setter named subroutine, such as those having hyphens, starting with a number, etc) so you have to hack your way around it. Infernal was this way IIRC. Maybe these should just be simply stored as a semi-validated set of key-value pairs.

From a quick glance at the list of TCoffee parameters, I don't at the moment see any that should cause a problem.

I have submitted a bug report[1] which mentions some other issues I found with TCoffee. If someone could comment on them, that would be great, and I can start fixing them.

Carnë

[1] https://redmine.open-bio.org/issues/3406

From yuf228 at hotmail.com Fri Feb 1 04:15:15 2013
From: yuf228 at hotmail.com (Rob)
Date: Fri, 1 Feb 2013 04:15:15 +0000 (UTC)
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
References: <200305311150.h4VBopn2019091@localhost.localdomain>
Message-ID:

Cyril C.C. Chua bmb.leeds.ac.uk> writes:

>
> Hi,
>
> I have some difficulty in sourcing for BLASTCLUST or related
> programs/mods. Does any1 know exactly how to locate them?
>
> Regards
>
> Cyril Chua
>

Hi Cyril,

I have heard of the following programs that might do similar things (I HAVEN'T used any of them yet):

Afree - http://www.vicbioinformatics.com/software.afree.shtml
Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf
Usearch - http://www.drive5.com/usearch/
DomClust - http://mbgd.genome.ad.jp/domclust/

or

Check this:

http://ppod.princeton.edu/help/help_tech.html

God bless,

Robert

From whereverroadgoes at gmail.com Mon Feb 4 15:39:19 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 07:39:19 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
Message-ID: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>

The result I get is:

Number of bases of type A =
Number of bases of type C =
Number of bases of type G =
Number of bases of type T =

i.e. the expected values are missing. Please help!

#! /usr/bin/perl
use Bio::Tools::SeqStats;
use Bio::Seq;
open (FILE, "seq.fasta");
@array = <FILE>;
# Removing first line of fasta
shift (@array);
$array = join('',@array);
open (FILE2, ">>seq2.fasta");
print FILE2 "$array";
$seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
- alphabet => 'dna',);
my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
my $monomer_ref = $seq_stats->count_monomers();
foreach $base (sort keys %$monomer_ref) {
print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";
}

From hamish.mcwilliam at bioinfo-user.org.uk Mon Feb 4 16:59:16 2013
From: hamish.mcwilliam at bioinfo-user.org.uk (Hamish McWilliam)
Date: Mon, 4 Feb 2013 16:59:16 +0000
Subject: [Bioperl-l] Where to get BLASTCLUST or equivalent?
In-Reply-To:
References: <200305311150.h4VBopn2019091@localhost.localdomain>
Message-ID:

BLASTCLUST is part of the legacy NCBI BLAST package (not NCBI BLAST+) and can be obtained from: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST

As Robert notes there are many other tools which can be used to perform sequence clustering, Wikipedia has a Sequence Clustering article (http://en.wikipedia.org/wiki/Sequence_clustering) which lists some of the most commonly used.

All the best,

Hamish

On 1 February 2013 04:15, Rob wrote:
> Cyril C.C. Chua bmb.leeds.ac.uk> writes:
>
>>
>> Hi,
>>
>> I have some difficulty in sourcing for BLASTCLUST or related
>> programs/mods. Does any1 know exactly how to locate them?
>>
>> Regards
>>
>> Cyril Chua
>>
>
>
> Hi Cyril,
>
> I heard of the following programmes that might do similar things (I HAVEN'T
> used any of them yet):
>
> Afree - http://www.vicbioinformatics.com/software.afree.shtml
> Uclust - http://drive5.com/uclust/uclust_userguide_2_1.pdf
> Usearch - http://www.drive5.com/usearch/
> DomClust - http://mbgd.genome.ad.jp/domclust/
>
> or
>
> Check this:
>
> http://ppod.princeton.edu/help/help_tech.html
>
> God bless,
>
>
> Robert
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
----
"Saying the internet has changed dramatically over the last five years
is cliché - the internet is always changing dramatically"
- Craig Labovitz, Arbor Networks.

From whereverroadgoes at gmail.com Mon Feb 4 17:34:10 2013
From: whereverroadgoes at gmail.com (Slym)
Date: Mon, 4 Feb 2013 09:34:10 -0800 (PST)
Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases
In-Reply-To:
References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com>
Message-ID:

Thanks Roy,

It still doesn't seem to produce anything.
:/ From roy.chaudhuri at gmail.com Mon Feb 4 17:51:03 2013 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Mon, 4 Feb 2013 17:51:03 +0000 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: Sorry, I'd missed another problem in your code - you are trying to load a fasta file using Bio::PrimarySeq. To read sequence data from a file you should use Bio::SeqIO, see: http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_file http://www.bioperl.org/wiki/HOWTO:SeqIO Cheers, Roy. From asjo at koldfront.dk Mon Feb 4 17:58:25 2013 From: asjo at koldfront.dk (Adam Sjøgren) Date: Mon, 04 Feb 2013 18:58:25 +0100 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> (Slym's message of "Mon, 4 Feb 2013 07:39:19 -0800 (PST)") References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: <8738xc2c72.fsf@topper.koldfront.dk> On Mon, 4 Feb 2013 07:39:19 -0800 (PST), Slym wrote: > #! /usr/bin/perl > use Bio::Tools::SeqStats; > use Bio::Seq; It can be a good idea to add "use strict; use warnings;" to the top of your script. At least two problems in your program would have been caught by perl if you had. > open (FILE, "seq.fasta"); Using (global) literal filehandles and the two parameter open() is somewhat outdated, a more current way to do it could be: open my $fh, '<', 'seq.fasta'; > @array = <FILE>; > # Removing first line of fasta > shift (@array); > $array = join('',@array); > open (FILE2, ">>seq2.fasta"); > print FILE2 "$array"; Note that you are writing just the sequence to your seq2.fasta file here, so the new file isn't really a fasta file. > $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", > - alphabet => 'dna',); Bio::PrimarySeq doesn't take a '-file' parameter. Also, note that the filename is different than before "sekw2" vs. "seq2"!
Either you should use Bio::SeqIO with a '-file' parameter, or you can use Bio::PrimarySeq with a '-seq' parameter. > my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj); > my $monomer_ref = $seq_stats->count_monomers(); > foreach $base (sort keys %$monomer_ref) { > print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n"; Here you wanted $monomer_ref->{$base}, as %monomer_ref isn't mentioned anywhere else. > } Here is a complete version of your script - I chose to use Bio::SeqIO - that works: #!/usr/bin/perl use strict; use warnings; use Bio::SeqIO; use Bio::Tools::SeqStats; my $io=Bio::SeqIO->new(-file=>'seq.fasta', -alphabet=>'dna'); my $seqobj=$io->next_seq; # Get the first sequence from the file my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj); my $monomer_ref = $seq_stats->count_monomers(); foreach my $base (sort keys %$monomer_ref) { print "Liczba zasad typu ", $base," = ", $monomer_ref->{$base},"\n"; } E.g.: $ cat seq.fasta >test aaaacccggt $ ./slym.pl Liczba zasad typu A = 4 Liczba zasad typu C = 3 Liczba zasad typu G = 2 Liczba zasad typu T = 1 $ Best regards, Adam -- "Grittings. Ma nam is Kahlfin." Adam Sj?gren asjo at koldfront.dk From whereverroadgoes at gmail.com Mon Feb 4 18:02:29 2013 From: whereverroadgoes at gmail.com (Slym) Date: Mon, 4 Feb 2013 10:02:29 -0800 (PST) Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an error (saying that it wants input provided by Bio::PrimarySeq). (btw in this line $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => 'dna',); there's a typo "sekw2" instead of "seq2" but this is correct in my original code). 
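A note on the '-seq' route Adam mentions: whatever is passed as the sequence has to be the residues only, without the '>' header line and without line breaks. The sketch below shows just that step and a hand-rolled base count in core Perl. It deliberately does not use BioPerl, and read_first_fasta_seq() and count_bases() are hypothetical helper names made up for this illustration, not Bio::SeqIO or Bio::Tools::SeqStats functions. It also shows the $ref->{key} form needed to read from a hash reference.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the residues of the
# first record in FASTA-formatted text as one string, without the
# '>' header line and without line breaks.
sub read_first_fasta_seq {
    my ($fasta_text) = @_;
    open my $fh, '<', \$fasta_text or die "Cannot open in-memory FASTA: $!";
    my $seq     = '';
    my $started = 0;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>/) {
            last if $started;    # a second header ends the first record
            $started = 1;
            next;
        }
        $seq .= $line if $started;
    }
    close $fh;
    return $seq;
}

# Hypothetical stand-in for a monomer count: returns a hash reference
# mapping each upper-cased character to the number of times it occurs.
sub count_bases {
    my ($seq) = @_;
    my %count;
    $count{ uc $_ }++ for split //, $seq;
    return \%count;
}

my $fasta  = ">test\naaaaccc\nggt\n";
my $seq    = read_first_fasta_seq($fasta);
my $counts = count_bases($seq);

# $counts is a reference, so elements are read with ->{...};
# $counts{$base} would look in an unrelated hash named %counts.
foreach my $base (sort keys %$counts) {
    print "Number of bases of type $base = $counts->{$base}\n";
}
```

With BioPerl installed, the cleaned string is what one would hand to Bio::PrimarySeq->new(-seq => $seq, -alphabet => 'dna'); that call is left out here so the sketch runs with core Perl alone.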
From cjfields at illinois.edu Mon Feb 4 18:54:39 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 4 Feb 2013 18:54:39 +0000 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE161ED@CHIMBX5.ad.uillinois.edu> Please make sure and read both Roy's and Adam's responses all the way through; Bio::SeqIO is not a sequence object but the front-end for format parsing (e.g. FASTA, etc). Bio::PrimarySeq does not have a '-file' parameter, Bio::SeqIO does. If SeqStats truly doesn't work with Bio::Seq we can fix that, but according to Adam he has tested using Bio::SeqIO out and it seems to work. chris On Feb 4, 2013, at 12:02 PM, Slym wrote: > The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an > error (saying that it wants input provided by Bio::PrimarySeq). > (btw in this line > $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta", - alphabet => > 'dna',); > there's a typo "sekw2" instead of "seq2" but this is correct in my original > code).
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From asjo at koldfront.dk Mon Feb 4 20:00:32 2013 From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=) Date: Mon, 04 Feb 2013 21:00:32 +0100 Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: (Slym's message of "Mon, 4 Feb 2013 10:02:29 -0800 (PST)") References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> Message-ID: <87txpr26jj.fsf@topper.koldfront.dk> On Mon, 4 Feb 2013 10:02:29 -0800 (PST), Slym wrote: > The thing is, if I use Bio::SeqIO then Bio::Tools::SeqStats produces an > error (saying that it wants input provided by Bio::PrimarySeq). That sounds like you forgot to call ->next_seq() on the Bio::SeqIO object - to get a sequence object - please see the complete, working example I sent earlier. Best regards, Adam -- "Denial springs eternal." Adam Sj?gren asjo at koldfront.dk From scott at scottcain.net Tue Feb 5 14:45:14 2013 From: scott at scottcain.net (Scott Cain) Date: Tue, 5 Feb 2013 09:45:14 -0500 Subject: [Bioperl-l] Have your say in the 2013 GMOD Community Survey! Message-ID: Give us your thoughts on the GMOD project and win a personal DNA test from 23andMe! The GMOD project provides tools like GBrowse, Galaxy, MAKER, JBrowse, Tripal, Apollo, Chado, and many more to a huge community of users and developers around the world. To make sure that GMOD is giving you the support you need, we want to know how you use GMOD, which components you find valuable, your opinion on support, training, and GMOD's strengths and weaknesses. Your feedback is vital in helping GMOD to serve its user community more effectively and to suggest future directions for the project. 
Do the survey: http://gmod.org/survey.html The survey should take between 10 and 15 minutes (including thinking time), and participants can enter a draw to win "A Journey Through Your DNA", the personal DNA test from 23andMe (the winner can pick a $50 Amazon gift voucher if they prefer). The survey will be open until March 1st. Results will be collated and discussed at the April 2013 GMOD Meeting in Cambridge, UK, and posted on the GMOD wiki at http://gmod.org. Please spread the word to other friends and colleagues who use GMOD: the more voices we hear, the better the picture we get of the needs of our users, and the better we can help you! Do the survey: http://gmod.org/survey.html If you have any questions or problems with the survey, please email me -- I will be happy to help out! Thanks, Scott -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From tiago.hori at gmail.com Tue Feb 5 15:21:55 2013 From: tiago.hori at gmail.com (Tiago Hori) Date: Tue, 5 Feb 2013 07:21:55 -0800 (PST) Subject: [Bioperl-l] Search I::O Message-ID: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com> Hi All, I am trying to find the best putative orthologs for 44K Atlantic Salmon sequences, and so I need to parse 44K BLAST reports to find the best human hit. 
I am trying to learn Search::IO, but when I try the first example on the HOWTO: use strict; use Bio::SearchIO; my $in = new Bio::SearchIO(-format => 'blast' -file => 'C001R047.txt'); while( my $result = $in->next_result ) { ## $result is a Bio::Search::Result::ResultI compliant object while( my $hit = $result->next_hit ) { ## $hit is a Bio::Search::Hit::HitI compliant object while( my $hsp = $hit->next_hsp ) { ## $hsp is a Bio::Search::HSP::HSPI compliant object if( $hsp->length('total') > 50 ) { if ( $hsp->percent_identity >= 75 ) { print "Query=", $result->query_name, " Hit=", $hit->name, " Length=", $hsp->length('total'), " Percent_id=", $hsp->percent_identity, "\n"; } } } } } I get this error: Odd number of elements in hash assignment at /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. I am using BioPerl version 1.6.901. Is there a format problem with the blast reports? Any help would be greatly appreciated! T. From tiago.hori at gmail.com Tue Feb 5 15:33:32 2013 From: tiago.hori at gmail.com (Tiago Hori) Date: Tue, 5 Feb 2013 07:33:32 -0800 (PST) Subject: [Bioperl-l] Search::IO example from HOWTO Message-ID: Hi All, I am trying to run the example from the Search::IO HOWTO: use strict; use Bio::SearchIO; my $in = new Bio::SearchIO(-format => 'blast' -file => 'test.txt'); while( my $result = $in->next_result ) { ## $result is a Bio::Search::Result::ResultI compliant object while( my $hit = $result->next_hit ) { ## $hit is a Bio::Search::Hit::HitI compliant object while( my $hsp = $hit->next_hsp ) { ## $hsp is a Bio::Search::HSP::HSPI compliant object if( $hsp->length('total') > 50 ) { if ( $hsp->percent_identity >= 75 ) { print "Query=", $result->query_name, " Hit=", $hit->name, " Length=", $hsp->length('total'), " Percent_id=", $hsp->percent_identity, "\n"; } } } } } And I get this error: Odd number of elements in hash assignment at /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. Can anybody help? Cheers, T.
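The "Odd number of elements in hash assignment" message is what Perl reports when a list with an odd number of items is assigned to a hash. In the constructor call quoted above there is no comma between 'blast' and -file, so Perl reads 'blast' -file as a subtraction and the argument list comes out one element short. The core-Perl sketch below reproduces this without BioPerl; args_to_hash() is a hypothetical stand-in for a constructor that copies its named arguments into a hash, and is not the actual Bio::SearchIO code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a constructor that stores named arguments in a
# hash. Returns the number of arguments received and any warnings raised.
sub args_to_hash {
    my @args = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my %param = @args;    # warns if @args has an odd number of elements
    return (scalar @args, \@warnings);
}

my (@broken, @fixed);
{
    # Silence the "isn't numeric" warnings from the accidental subtraction
    # so that only the hash-assignment warning is of interest here.
    no warnings 'numeric';
    @broken = (-format => 'blast' -file => 'test.txt');    # comma missing
}
@fixed = (-format => 'blast', -file => 'test.txt');        # comma present

for my $case ([ broken => \@broken ], [ fixed => \@fixed ]) {
    my ($label, $args) = @$case;
    my ($n, $warnings) = args_to_hash(@$args);
    print "$label: $n arguments\n";
    print "  warning: $_" for @$warnings;
}
```

In the script from the HOWTO the fix is the single comma: Bio::SearchIO->new(-format => 'blast', -file => 'C001R047.txt').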
From carandraug+dev at gmail.com Tue Feb 5 18:56:21 2013 From: carandraug+dev at gmail.com (Carnë Draug) Date: Tue, 5 Feb 2013 18:56:21 +0000 Subject: [Bioperl-l] removing packages from bioperl-live Message-ID: Hi some of the bioperl-live packages have already been split into separate repositories. However, they were never actually removed from bioperl-live. This creates 2 entry points for bug fixes and implementations. After a chat on #bioperl, I was told to ask here. Should these be removed? For example, there's bioperl-FeatureIO but that code also exists in bioperl-live. Can I remove it from bioperl-live? Carnë From cjfields at illinois.edu Tue Feb 5 19:34:07 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 19:34:07 +0000 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Probably should retitle this to ask the question directly (make sure the right radars are pinged). My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). chris On Feb 5, 2013, at 12:56 PM, Carnë Draug wrote: > Hi > > some of the bioperl-live packages have already been split into > separate repositories. However, they were never actually removed from > bioperl-live. This creates 2 entry points for bug fixes and > implementations. After a chat on #bioperl, I was told to ask here. > > Should these be removed? For example, there's bioperl-FeatureIO but > that code also exists in bioperl-live. Can I remove it from > bioperl-live? > > Carnë
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Tue Feb 5 19:36:10 2013 From: scott at scottcain.net (Scott Cain) Date: Tue, 5 Feb 2013 14:36:10 -0500 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: I'm sure it will lead to lots of fun, but I suspect you are right and it should be removed. It's time you yank on that bandaid :-) Scott On Tue, Feb 5, 2013 at 2:34 PM, Fields, Christopher J wrote: > Probably should retitle this to ask the question directly (make sure the right radars are pinged). > > My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). > > chris > > On Feb 5, 2013, at 12:56 PM, Carn? Draug wrote: > >> Hi >> >> some of the bioperl-live packages have already been split into >> separate repositories. However, they were never actually removed from >> bioperl-live. This creates 2 entry points for bug fixes and >> implementations. After a chat on #bioperl, I was told to ask here. >> >> Should these be removed? For example, there's bioperl-FeatureIO but >> that code alo exists in bioperl-live. Can I remove it from >> bioperl-live? >> >> Carn? 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From carandraug+dev at gmail.com Tue Feb 5 20:06:23 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 5 Feb 2013 20:06:23 +0000 Subject: [Bioperl-l] dependencies on perl version Message-ID: Hi how much perl backwards compatibility does bioperl needs to keep? If I have something I want to implement and use state (requires 5.010), is it acceptable? 5.010 is already a quite old perl version. Of course, there are other less elegant ways to implement those features. If I can't use modern perl stuff, what version number is the limit? Carn? From carandraug+dev at gmail.com Tue Feb 5 20:10:01 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 5 Feb 2013 20:10:01 +0000 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: On 5 February 2013 19:34, Fields, Christopher J wrote: > Probably should retitle this to ask the question directly (make sure the right radars are pinged). > > My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. 
I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). Mentioning Bio::FeatureIO was just an example. I meant to ask it as more general. If the code is already in a separate repository, should it be removed from bioperl-live? Carn? From cjfields at illinois.edu Tue Feb 5 20:56:48 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 20:56:48 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) chris On Feb 5, 2013, at 2:06 PM, Carn? Draug wrote: > Hi > > how much perl backwards compatibility does bioperl needs to keep? > > If I have something I want to implement and use state (requires > 5.010), is it acceptable? 5.010 is already a quite old perl version. > Of course, there are other less elegant ways to implement those > features. If I can't use modern perl stuff, what version number is the > limit? > > Carn? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Feb 5 20:59:38 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 20:59:38 +0000 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 2:10 PM, Carn? 
Draug wrote: > On 5 February 2013 19:34, Fields, Christopher J wrote: >> Probably should retitle this to ask the question directly (make sure the right radars are pinged). >> >> My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). > > Mentioning Bio::FeatureIO was just an example. I meant to ask it as > more general. If the code is already in a separate repository, should > it be removed from bioperl-live? > > Carn? Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better). Once we get a new release out we should remove the rest. chris From cjfields at illinois.edu Tue Feb 5 21:53:29 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 5 Feb 2013 21:53:29 +0000 Subject: [Bioperl-l] Next BioPerl release Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> All, I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: https://github.com/bioperl/Bio-FeatureIO Feedback, suggestions, etc are greatly appreciated. 
chris From miker at htblis.com Wed Feb 6 00:54:17 2013 From: miker at htblis.com (Michael Rogoff) Date: Tue, 5 Feb 2013 16:54:17 -0800 Subject: [Bioperl-l] Bio::Graphics error when rendering features with Split locations Message-ID: When trying to render features from a genbank file that include a split location e.g.: promoter join(1000..1080,1..5) /label=PROM1 The following exception is raised: Can't locate object method "has_tag" via package "Bio::Location::Simple" at lib/perl5/site_perl/5.10.1/Bio/Graphics/Glyph.pm line 704, line 36. This can be reproduced with the code in the example "Rendering Features from a GenBank or EMBL File" from the Graphics HOW-TO: http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File Is there a way to change the script so that split locations would, at the very least, not cause a fatal error? Is there a different glyph type that needs to be used? Thanks in advance for any help. I've attached a simple genbank input that will reproduce the error: LOCUS sample2 1080 bp DNA circular DEFINITION Cloning vector sample2 ACCESSION sample2 VERSION sample2.1 GI:4352432 COMMENT Component Fragments FEATURES Location/Qualifiers terminator 39..328 /label=TERM1 /note="terminator 1" misc_feature 393..488 /label=MF1 CDS complement(800..900) /label=CDS1 /note="resistence gene" promoter join(1000..1080,1..5) /label=PROM1 ORIGIN 1 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 61 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 121 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 181 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 241 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 301 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 361 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 421 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 481 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 541 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 601 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 661 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 721 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 781 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 841 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 901 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 961 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1021 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn // P.S. I think I have traced the source of the problem to Glyph's _subfeat method, which in the case of a feature with split locations is returning location objects instead of feature objects. Is this a bug? sub _subfeat { my $class = shift; my $feature = shift; return $feature->segments if $feature->can('segments'); my @split = eval { my $id = $feature->location->seq_id; my @subs = $feature->location->sub_Location; grep {$id eq $_->seq_id} @subs; }; return @split if @split; # Either the APIs have changed, or I got confused at some point... return $feature->get_SeqFeatures if $feature->can('get_SeqFeatures'); return $feature->sub_SeqFeature if $feature->can('sub_SeqFeature'); return; } From l.m.timmermans at students.uu.nl Wed Feb 6 02:40:27 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 03:40:27 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J wrote: > Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. > > (for those who don't like this, please speak up. 
perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) I *really* hate saying it, but I fear a lot of places are still stuck on 5.8, in particular on 5.8.8 because of CentOS 5. I know my department still is and doesn't seem to be in a hurry to upgrade, and I'm pretty sure it won't be the only one (though personally I use a self-compiled 5.16). Leon From florent.angly at gmail.com Wed Feb 6 02:51:27 2013 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 06 Feb 2013 12:51:27 +1000 Subject: [Bioperl-l] Removing Bio::FeatureIO? was Re: removing packages from bioperl-live In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE17EDA@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1829D@CHIMBX5.ad.uillinois.edu> Message-ID: <5111C52F.50101@gmail.com> On 06/02/13 06:59, Fields, Christopher J wrote: > On Feb 5, 2013, at 2:10 PM, Carn? Draug wrote: > >> On 5 February 2013 19:34, Fields, Christopher J wrote: >>> Probably should retitle this to ask the question directly (make sure the right radars are pinged). >>> >>> My vote is yes, it should be removed. There were a lot of implementation issues with it that ended up becoming problematic. I do believe it is used, though, so I would like to get additional responses from the community before removing it and pointing to the separate repository (where there has been a lot of experimenting going on). >> Mentioning Bio::FeatureIO was just an example. I meant to ask it as >> more general. If the code is already in a separate repository, should >> it be removed from bioperl-live? >> >> Carn? > Yes for Bio::FeatureIO, no for Bio::Root::Root and the others at the moment (I want to get a release out by March 1, which I'm planning on announcing later today, so the less disruptive it is the better). Once we get a new release out we should remove the rest. 
> > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Sounds good to me (I've been burnt once by the fact that Bio::FeatureIO is in two places). Florent From florent.angly at gmail.com Wed Feb 6 02:56:19 2013 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 06 Feb 2013 12:56:19 +1000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: <5111C653.2010703@gmail.com> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). Florent On 06/02/13 12:40, Leon Timmermans wrote: > On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J > wrote: >> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >> >> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) > I *really* hate saying it, but I fear a lot of places are still stuck > on 5.8, in particular on 5.8.8 because of CentOS 5. I know my > department still is and doesn't seem to be in a hurry to upgrade, and > I'm pretty sure it won't be the only one (though personally I use a > self-compiled 5.16). 
> > Leon > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at drycafe.net Wed Feb 6 03:27:35 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 5 Feb 2013 22:27:35 -0500 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <09524241-59F8-4BFF-8054-53CD0A649C11@drycafe.net> On Feb 5, 2013, at 4:53 PM, Fields, Christopher J wrote: > I am scheduling the next BioPerl CPAN release tentatively for March 1. Yay!! Thanks for your leadership again, Chris, and for volunteering your time for the project. If nothing else, and I know this is no compensation really worth speaking of, we owe you beer, and I'll certainly pay my debt to you in Berlin if you come there. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From hlapp at drycafe.net Wed Feb 6 03:32:40 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 5 Feb 2013 22:32:40 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <5111C653.2010703@gmail.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS. 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. -hilmar On Feb 5, 2013, at 9:56 PM, Florent Angly wrote: > For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). 
> Florent > > On 06/02/13 12:40, Leon Timmermans wrote: >> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J >> wrote: >>> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >>> >>> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) >> I *really* hate saying it, but I fear a lot of places are still stuck >> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my >> department still is and doesn't seem to be in a hurry to upgrade, and >> I'm pretty sure it won't be the only one (though personally I use a >> self-compiled 5.16). >> >> Leon >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Wed Feb 6 03:58:08 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 03:58:08 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18CBE@CHIMBX5.ad.uillinois.edu> Re: being held back, I agree. I don't necessarily want to intentionally break current modules by adding modern code unless it can be demonstrated to be a decent benefit performance-wise, but I don't want to impede new additions by requiring compat with perl 5.8 (hence my suggestion of a 'use 5.01x' pragma when appropriate). 
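For readers unfamiliar with the pragma being discussed, here is a minimal sketch of what a 'use 5.010' line buys, using the 'state' feature Carnë asked about. The subroutine names next_id() and next_id_58() are hypothetical and exist only for this illustration; the second one shows one way to get the same behaviour on perl 5.8.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;    # refuses to run on older perls and enables 'state' and 'say'

# Hypothetical example: a counter that keeps its value between calls.
sub next_id {
    state $id = 0;    # initialised once, on the first call only
    return ++$id;
}

# The same behaviour written for perl 5.8, using a closure over a lexical.
{
    my $id = 0;
    sub next_id_58 { return ++$id; }
}

say next_id()    for 1 .. 3;
say next_id_58() for 1 .. 3;
```

A module that carries the 'use 5.010' line fails early with a clear version message on an older perl, rather than with a syntax error somewhere inside the file.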
Ubuntu 12.04 LTS is on perl 5.14.2: http://askubuntu.com/questions/80672/what-perl-version-will-be-in-12-04-lts BTW, I was wrong about perl 5.8 being 8 yrs old; it's almost 11 yrs old (perl 5.8.0 was released on 7/18/2002). perl 5.8 reached end-of-life in 2008, fixes being only for security reasons. So, I support dropping perl 5.8 support, but we should have a decent route of use for the folks stuck on old clusters. chris On Feb 5, 2013, at 9:32 PM, Hilmar Lapp wrote: > Does anyone know what Ubuntu uses? I've heard lots of other old version problems with CentOS. > > 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. > > -hilmar > > On Feb 5, 2013, at 9:56 PM, Florent Angly wrote: > >> For what it's worth, the current stable version of Debian uses perl 5.10.1 (http://packages.debian.org/stable/perl/perl). >> Florent >> >> On 06/02/13 12:40, Leon Timmermans wrote: >>> On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J >>> wrote: >>>> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >>>> >>>> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) >>> I *really* hate saying it, but I fear a lot of places are still stuck >>> on 5.8, in particular on 5.8.8 because of CentOS 5. I know my >>> department still is and doesn't seem to be in a hurry to upgrade, and >>> I'm pretty sure it won't be the only one (though personally I use a >>> self-compiled 5.16). 
>>> >>> Leon >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From l.m.timmermans at students.uu.nl Wed Feb 6 04:11:52 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 05:11:52 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp wrote: > Does anyone know what Ubuntu uses? 5.14.2, distrowatch is your friend ;-) > I've heard lots of other old version problems with CentOS. I know people who still use CentOS 4 in production :-| > 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. CentOS 5 is 6 years old (and will be supported another 4), but CentOS 6 is 'only' 19 months. perl missing a release in the 5.8-5.10 timeframe combined with an unfortunate alignment of its release schedule with Red Hat's don't do us any favors here. 
Leon From cjfields at illinois.edu Wed Feb 6 04:14:24 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 04:14:24 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18E52@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 8:40 PM, Leon Timmermans wrote: > On Tue, Feb 5, 2013 at 9:56 PM, Fields, Christopher J > wrote: >> Aim for 5.10.1, but be careful of smart-match. If you do this, make sure to add a 'use 5.010' pragma at the top. >> >> (for those who don't like this, please speak up. perl 5.8 has been around almost 8 yrs, we would like to allow using new features if at all possible) > > I *really* hate saying it, but I fear a lot of places are still stuck > on 5.8, in particular on 5.8.8 because of CentOS 5. I know my > department still is and doesn't seem to be in a hurry to upgrade, and > I'm pretty sure it won't be the only one (though personally I use a > self-compiled 5.16). > > Leon We had the same problem for a while, but our sysadmins were willing to set up perl 5.12 (at that time) loadable as a module (we can of course set up a local perl as well). We're now using a sysadmin-installed perl 5.16 with our current cluster. chris From cjfields at illinois.edu Wed Feb 6 04:24:31 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 04:24:31 +0000 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> On Feb 5, 2013, at 10:11 PM, Leon Timmermans wrote: > On Wed, Feb 6, 2013 at 4:32 AM, Hilmar Lapp wrote: >> Does anyone know what Ubuntu uses? > > 5.14.2, distrowatch is your friend ;-) > >> I've heard lots of other old version problems with CentOS. 
> > I know people who still use CentOS 4 in production :-| > >> 8 years is really old, and at some point I fear that weighing backwards compatibility too much just holds us back in a real detrimental way. > > CentOS 5 is 6 years old (and will be supported another 4), but CentOS > 6 is 'only' 19 months. perl missing a release in the 5.8-5.10 > timeframe combined with an unfortunate alignment of its release > schedule with Red Hat's don't do us any favors here. > > Leon Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7). We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases. chris From l.m.timmermans at students.uu.nl Wed Feb 6 04:33:57 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 6 Feb 2013 05:33:57 +0100 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 6, 2013 at 5:24 AM, Fields, Christopher J wrote: > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point out that Python users are in the same boat: the Python version for CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 (and recommends python 2.7). > > We can always state that perl 5.8 is supported for the upcoming Bioperl release, but we're dropping v5.8 support for any future releases. Sounds reasonable. These things shouldn't come as a surprise. I suspect that the thing that will save us is that most of these people install it once and then never upgrade. 
Leon From hartzell at alerce.com Wed Feb 6 17:58:07 2013 From: hartzell at alerce.com (George Hartzell) Date: Wed, 6 Feb 2013 09:58:07 -0800 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> Message-ID: <20754.39343.128576.743448@gargle.gargle.HOWL> Fields, Christopher J writes: > [...] > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point > out that Python users are in the same boat: the Python version for > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > (and recommends python 2.7). > > We can always state that perl 5.8 is supported for the upcoming > Bioperl release, but we're dropping v5.8 support for any future > releases. Do more than drop support for 5.8. The Perl community has put a transparent and predictable process in place for releasing [generally] better versions of the language. It means that Perl has a chance of continuing to be relevant, attracting new talent and actually *fixing* some of the s&%t that gives Perl a bad rap. It gives people something to plan around, no one should be surprised that v 5.X.Y is coming out in mid 20ZZ. BioPerl should do the same thing, declare a release policy that trails along with the Perl release schedule. Keep it simple and no one can argue with it. Support Perl releases as long as the releases themselves are supported. Rather than expending energy supporting out of date platforms, put the energy into being modern (or Modern...), better distro building and packaging, testing, documentation and releasing so that the process of staying current is painless. Look forward. Keep it interesting and fun. Everyone running Mac OS 9 on their Pismo, raise your hand. 
Anyone make their living running sequencing gels in Plexiglas doohickeys on their lab bench? I'm not suggesting that the BioPerl community is free to make arbitrary and capricious changes that makes it difficult for *anyone* to get anything done. Churn is a waste of time. But why should the all-volunteer BioPerl community be stuck supporting code from 12 years ago because it's cost effective for someone else to avoid spending *their* $/time/people to stay up to date. Those sites that value stability/maturity/stagnation so highly have already accepted the cost/difficulty of nailing one of their feet to the floor as they try to run forward. They recognize and depend on the benefits of having that stable base but generally they've also accepted the costs associated with their restrictive choices. They know how to pull in separate kernel/driver updates so that they can actually run on nearly modern hardware. They know, and live with, the fact that they're not going to have access to the shiny new stuff. And they know how to stay up to date, when they need to, with the software that their users need to be competitive (e.g. BioConductor and R). As long as (if/when...) updating a BioPerl release is something that can reliably happen with a few cpanm invocations then the sites that otherwise favor punctuated equilibrium will learn to handle gradual change. Those folks that are "stuck" on older releases always have the option of supporting professional Perl programmers to keep older releases going, backport changes, etc.... They're already buying support for their platforms (or freeloading and coping), let them put bread on the table at one of the bioinformatics consultancies or labs if they have something special they need. Have fun. Use sharp tools. Do cool science. Build cool things. No one is paying you to be backwards compatible with the previous millennium. g. 
From amackey at virginia.edu Wed Feb 6 18:47:46 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Wed, 6 Feb 2013 13:47:46 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: Huzzah! -- Aaron J. Mackey, PhD Assistant Professor Center for Public Health Genomics University of Virginia amackey at virginia.edu http://www.cphg.virginia.edu/mackey On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: > > [...] > > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point > > out that Python users are in the same boat: the Python version for > > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > > (and recommends python 2.7). > > > > We can always state that perl 5.8 is supported for the upcoming > > Bioperl release, but we're dropping v5.8 support for any future > > releases. > > Do more than drop support for 5.8. > > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. 
> > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. > > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. > > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... 
They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From tiago.hori at gmail.com Wed Feb 6 13:25:41 2013 From: tiago.hori at gmail.com (Tiago Hori) Date: Wed, 6 Feb 2013 05:25:41 -0800 (PST) Subject: [Bioperl-l] Problems installing Bio::Tools::Run:StandAloneBlastPlus Message-ID: <9b488c6e-34b3-4269-a7ac-e2206720939a@googlegroups.com> Hi Guys, I am trying to install the module Bio::Tools::Run:StandAloneBlastPlus, but it has been hard so far. I managed to install and compile samtools, after finding all the dependencies, but I am still missing something! I posted the complete report below! Any help, would be great! Cheers, T. 
cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus Reading '/home/tiagohori/.cpan/Metadata' Database was generated on Tue, 05 Feb 2013 18:41:03 GMT Running install for module 'Bio::Tools::Run::StandAloneBlastPlus' Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Checksum for /home/tiagohori/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz ok Scanning cache /home/tiagohori/.cpan/build for sizes ..................................------------------------------------------DONE DEL(1/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz DEL(2/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-qpHfzz.yml DEL(3/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO DEL(4/20): /home/tiagohori/.cpan/build/BioPerl-Run-1.006900-nMOXgO.yml DEL(5/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC DEL(6/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-bgBQyC.yml DEL(7/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt DEL(8/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-Ki3dbt.yml DEL(9/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4 DEL(10/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-ciM7U4.yml DEL(11/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5 DEL(12/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-oDyi_5.yml DEL(13/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn DEL(14/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-AQiiAn.yml DEL(15/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o DEL(16/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-0H2Z9o.yml DEL(17/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U DEL(18/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-c_8A_U.yml DEL(19/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v DEL(20/20): /home/tiagohori/.cpan/build/Bio-SamTools-1.37-lWtV8v.yml CPAN.pm: Building C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Install scripts? 
y/n [n ] n Do you want to run tests that require connection to servers across the internet (likely to cause some failures)? y/n [n ] n - will not run internet-requiring tests Created MYMETA.yml and MYMETA.json Creating new 'Build' script for 'BioPerl-Run' version '1.006900' Building BioPerl-Run CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build -- OK Running Build test t/Amap.t ...................... 1/18 # Required executable for Bio::Tools::Run::Alignment::Amap is not present t/Amap.t ...................... ok t/AnalysisFactory_soap.t ...... skipped: Network tests have not been requested t/Analysis_soap.t ............. skipped: Network tests have not been requested t/BEDTools.t .................. 3/423 # Required executable for Bio::Tools::Run::BEDTools is not present t/BEDTools.t .................. ok t/BWA.t ....................... 1/36 # Required executable for Bio::Tools::Run::BWA is not present t/BWA.t ....................... ok t/Blat.t ...................... 1/33 # Required executable for Bio::Tools::Run::Alignment::Blat is not present # Looks like you planned 33 tests but ran 20. t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 13/33 subtests (less 15 skipped subtests: 5 okay) t/Bowtie.t .................... 1/73 # Required executable for Bio::Tools::Run::Bowtie is not present t/Bowtie.t .................... ok t/Cap3.t ...................... 1/91 # Required executable for Bio::Tools::Run::Cap3 is not present t/Cap3.t ...................... ok t/Clustalw.t .................. 1/45 # Required executable for Bio::Tools::Run::Alignment::Clustalw is not present t/Clustalw.t .................. ok t/Coil.t ...................... 2/6 # Required executable for Bio::Tools::Run::Coil is not present t/Coil.t ...................... ok t/Consense.t .................. 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::Consense is not present t/Consense.t .................. ok t/DBA.t ....................... 
1/18 # Required executable for Bio::Tools::Run::Alignment::DBA is not present t/DBA.t ....................... ok t/DrawGram.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawGram is not present t/DrawGram.t .................. ok t/DrawTree.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawTree is not present t/DrawTree.t .................. ok t/EMBOSS.t .................... ok t/Ensembl.t ................... skipped: Network tests have not been requested t/Eponine.t ................... 1/7 # Looks like you planned 7 tests but ran 2. t/Eponine.t ................... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 5/7 subtests t/Exonerate.t ................. 1/89 # Required executable for Bio::Tools::Run::Alignment::Exonerate is not present t/Exonerate.t ................. ok t/FootPrinter.t ............... 1/24 # Required executable for Bio::Tools::Run::FootPrinter is not present t/FootPrinter.t ............... ok t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable $GENEMARK_MODELS is not set t/Genemark.hmm.prokaryotic.t .. ok t/Genewise.t .................. 1/20 # Required executable for Bio::Tools::Run::Genewise is not present t/Genewise.t .................. ok t/Genscan.t ................... 1/6 # Required environment variable $GENSCANDIR is not set t/Genscan.t ................... ok t/Gerp.t ...................... 1/33 # Required executable for Bio::Tools::Run::Phylo::Gerp is not present t/Gerp.t ...................... ok t/Glimmer2.t .................. 1/217 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer2.t .................. ok t/Glimmer3.t .................. 1/111 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer3.t .................. ok t/Gumby.t ..................... 1/124 # Required executable for Bio::Tools::Run::Phylo::Gumby is not present t/Gumby.t ..................... ok t/Hmmer.t ..................... 
1/27 # Required executable for Bio::Tools::Run::Hmmer is not present t/Hmmer.t ..................... ok t/Hyphy.t ..................... 2/15 # Required executable for Bio::Tools::Run::Phylo::Hyphy::SLAC is not present t/Hyphy.t ..................... ok t/Infernal.t .................. 1/43 # Required executable for Bio::Tools::Run::Infernal is not present t/Infernal.t .................. ok t/Kalign.t .................... 1/8 # Required executable for Bio::Tools::Run::Alignment::Kalign is not present t/Kalign.t .................... ok t/LVB.t ....................... 1/19 # Required executable for Bio::Tools::Run::Phylo::LVB is not present t/LVB.t ....................... ok t/Lagan.t ..................... 1/12 # Required executable for Bio::Tools::Run::Alignment::Lagan is not present t/Lagan.t ..................... ok t/MAFFT.t ..................... 1/17 # Required executable for Bio::Tools::Run::Alignment::MAFFT is not present t/MAFFT.t ..................... ok t/MCS.t ....................... 1/24 # Required executable for Bio::Tools::Run::MCS is not present t/MCS.t ....................... ok t/Maq.t ....................... 1/51 # Required executable for Bio::Tools::Run::Maq is not present t/Maq.t ....................... ok t/Match.t ..................... 1/7 # Required executable for Bio::Tools::Run::Match is not present t/Match.t ..................... ok t/Mdust.t ..................... 1/5 # Required executable for Bio::Tools::Run::Mdust is not present t/Mdust.t ..................... ok t/Meme.t ...................... 1/25 # Required executable for Bio::Tools::Run::Meme is not present t/Meme.t ...................... ok t/Minimo.t .................... 1/72 # Required executable for Bio::Tools::Run::Minimo is not present t/Minimo.t .................... ok t/Molphy.t .................... 1/10 # Required executable for Bio::Tools::Run::Phylo::Molphy::ProtML is not present t/Molphy.t .................... ok t/Muscle.t .................... 
1/16 # Required executable for Bio::Tools::Run::Alignment::Muscle is not present t/Muscle.t .................... ok t/Neighbor.t .................. 1/17 # Required executable for Bio::Tools::Run::Phylo::Phylip::Neighbor is not present t/Neighbor.t .................. ok t/Newbler.t ................... 1/98 # Required executable for Bio::Tools::Run::Newbler is not present t/Newbler.t ................... ok t/Njtree.t .................... 1/6 # Required executable for Bio::Tools::Run::Phylo::Njtree::Best is not present t/Njtree.t .................... ok t/PAML.t ...................... 1/28 # Required executable for Bio::Tools::Run::Phylo::PAML::Codeml is not present t/PAML.t ...................... ok t/Pal2Nal.t ................... 1/9 # Required executable for Bio::Tools::Run::Alignment::Pal2Nal is not present t/Pal2Nal.t ................... ok t/PhastCons.t ................. 1/181 # Required executable for Bio::Tools::Run::Phylo::Phast::PhastCons is not present t/PhastCons.t ................. ok t/Phrap.t ..................... 1/127 # Required executable for Bio::Tools::Run::Phrap is not present t/Phrap.t ..................... ok t/Phyml.t ..................... 1/47 # Required executable for Bio::Tools::Run::Phylo::Phyml is not present t/Phyml.t ..................... ok t/Primate.t ................... 1/8 # Required executable for Bio::Tools::Run::Primate is not present t/Primate.t ................... ok t/Primer3.t ................... 1/9 # Required executable for Bio::Tools::Run::Primer3 is not present t/Primer3.t ................... ok t/Prints.t .................... 1/7 # Required executable for Bio::Tools::Run::Prints is not present t/Prints.t .................... ok t/Probalign.t ................. 1/13 # Required executable for Bio::Tools::Run::Alignment::Probalign is not present t/Probalign.t ................. ok t/Probcons.t .................. 1/11 # Required executable for Bio::Tools::Run::Alignment::Probcons is not present t/Probcons.t .................. 
ok t/Profile.t ................... 1/7 # Required executable for Bio::Tools::Run::Profile is not present t/Profile.t ................... ok t/Promoterwise.t .............. 1/9 # Required executable for Bio::Tools::Run::Promoterwise is not present t/Promoterwise.t .............. ok t/ProtDist.t .................. 1/14 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtDist is not present t/ProtDist.t .................. ok t/ProtPars.t .................. 1/11 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtPars is not present t/ProtPars.t .................. ok t/Pseudowise.t ................ 1/18 # Required executable for Bio::Tools::Run::Pseudowise is not present t/Pseudowise.t ................ ok t/QuickTree.t ................. 1/13 # Required executable for Bio::Tools::Run::Phylo::QuickTree is not present t/QuickTree.t ................. ok t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as or not executable. # Required executable for Bio::Tools::Run::RepeatMasker is not present t/RepeatMasker.t .............. ok t/SABlastPlus.t ............... 1/65 # Required executable for Bio::Tools::Run::BlastPlus is not present # Looks like you planned 65 tests but ran 63. t/SABlastPlus.t ............... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 2/65 subtests (less 59 skipped subtests: 4 okay) t/SLR.t ....................... 1/7 # Required executable for Bio::Tools::Run::Phylo::SLR is not present t/SLR.t ....................... ok t/Samtools.t .................. ok t/Seg.t ....................... 1/8 # Required executable for Bio::Tools::Run::Seg is not present t/Seg.t ....................... ok t/Semphy.t .................... 1/19 # Required executable for Bio::Tools::Run::Phylo::Semphy is not present t/Semphy.t .................... ok t/SeqBoot.t ................... 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present t/SeqBoot.t ................... ok t/Signalp.t ................... 
1/7 # Required executable for Bio::Tools::Run::Signalp is not present t/Signalp.t ................... ok t/Sim4.t ...................... 1/23 # Required executable for Bio::Tools::Run::Alignment::Sim4 is not present t/Sim4.t ...................... ok t/Simprot.t ................... 1/6 # Required executable for Bio::Tools::Run::Simprot is not present t/Simprot.t ................... ok t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/SoapEU-unit.t ............... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/StandAloneFasta.t ........... 1/15 # Required executable for Bio::Tools::Run::Alignment::StandAloneFasta is not present t/StandAloneFasta.t ........... ok t/TCoffee.t ................... 1/27 # Required executable for Bio::Tools::Run::Alignment::TCoffee is not present t/TCoffee.t ................... ok t/TigrAssembler.t ............. 1/88 # Required executable for Bio::Tools::Run::TigrAssembler is not present # Required executable for Bio::Tools::Run::TigrAssembler is not present t/TigrAssembler.t ............. ok t/Tmhmm.t ..................... 1/9 # Required executable for Bio::Tools::Run::Tmhmm is not present t/Tmhmm.t ..................... ok t/TribeMCL.t .................. ok t/Vista.t ..................... ok t/gmap-run.t .................. 1/8 # Required executable for Bio::Tools::Run::Alignment::Gmap is not present t/gmap-run.t .................. ok t/tRNAscanSE.t ................ 1/12 # Required executable for Bio::Tools::Run::tRNAscanSE is not present t/tRNAscanSE.t ................ ok Test Summary Report ------------------- t/Blat.t (Wstat: 65280 Tests: 20 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 33 tests but ran 20. t/Eponine.t (Wstat: 65280 Tests: 2 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 7 tests but ran 2. 
t/SABlastPlus.t (Wstat: 65280 Tests: 63 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 65 tests but ran 63. Files=80, Tests=2876, 39 wallclock secs ( 0.54 usr 0.23 sys + 32.54 cusr 4.94 csys = 38.25 CPU) Result: FAIL Failed 3/80 test programs. 0/2876 subtests failed. CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build test -- NOT OK //hint// to see the cpan-testers results for installing this module, try: reports CJFIELDS/BioPerl-Run-1.006900.tar.gz Running Build install make test had returned bad status, won't install without force From guy.leonard at gmail.com Wed Feb 6 18:35:38 2013 From: guy.leonard at gmail.com (guy.leonard at gmail.com) Date: Wed, 6 Feb 2013 10:35:38 -0800 (PST) Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> Nice, super work. Will there be a rough list of feature changes/addition/deprecation, or shall I consult git logs? On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote: > > All, > > I am scheduling the next BioPerl CPAN release tentatively for March 1. > Any help in triaging bug reports would be greatly appreciated! > > Amongst all other changes, as mentioned in a separate thread we will > remove Bio::FeatureIO, now developed in a separate repository: > > https://github.com/bioperl/Bio-FeatureIO > > Feedback, suggestions, etc are greatly appreciated. > > chris > _______________________________________________ > Bioperl-l mailing list > Biop... 
at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From sidd.basu at gmail.com Wed Feb 6 19:36:17 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Wed, 6 Feb 2013 13:36:17 -0600 Subject: [Bioperl-l] Re: Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> Message-ID: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> Hi, On Tue, 05 Feb 2013, Fields, Christopher J wrote: > All, > > I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! 
> > Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: > > https://github.com/bioperl/Bio-FeatureIO > > Feedback, suggestions, etc are greatly appreciated. Here are CI build reports on 5.12, 5.14 and 5.16 using Travis. https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true I could not get 5.10 to work on Travis. Though I activated the (--network) option, it still didn't run one of the tests that needs the network. Also, I was initially confused by the fact that, although it has a dist.ini, the tests still have to be run through Build.PL. Running **dzil test** does not work. Hope this helps. thanks, -siddhartha > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 6 19:46:49 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 19:46:49 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <3e4d717e-b58a-4bfd-943d-6f213bfae260@googlegroups.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A109@CHIMBX5.ad.uillinois.edu> We've been a little better at keeping track of significant changes this time 'round. There aren't a lot of major updates, but it's important to make sure we get a release out to ensure everyone (not just those familiar with git) can access them. chris On Feb 6, 2013, at 12:35 PM, wrote: > Nice, super work. > > Will there be a rough list of feature changes/addition/deprecation, or > shall I consult git logs? 
> > On Tuesday, 5 February 2013 21:53:29 UTC, Christopher Fields wrote: >> >> All, >> >> I am scheduling the next BioPerl CPAN release tentatively for March 1. >> Any help in triaging bug reports would be greatly appreciated! >> >> Amongst all other changes, as mentioned in a separate thread we will >> remove Bio::FeatureIO, now developed in a separate repository: >> >> https://github.com/bioperl/Bio-FeatureIO >> >> Feedback, suggestions, etc are greatly appreciated. >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Biop... at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 6 19:54:58 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 19:54:58 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> On Feb 6, 2013, at 1:36 PM, Siddhartha Basu wrote: > Hi, > > On Tue, 05 Feb 2013, Fields, Christopher J wrote: > >> All, >> >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! >> >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: >> >> https://github.com/bioperl/Bio-FeatureIO >> >> Feedback, suggestions, etc are greatly appreciated. > > Here are CI build report on 5.12, 5.14 and 5.16 using travis. 
> https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true > > Could not get 5.10 to work on travis. Though i activated the (--network) > option, it still didn't run one of the test that needs network. Also, initially got > confused by the fact that though it has dist.ini, the tests still has > to run through Build.PL. Running **dzil test** do not work. > > Hope this helps. > > thanks, > -siddhartha Just to point out, that was for Bio-FeatureIO. Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release). Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken). I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed. Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation. chris From sidd.basu at gmail.com Wed Feb 6 20:26:06 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Wed, 6 Feb 2013 14:26:06 -0600 Subject: [Bioperl-l] Re: Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> Message-ID: <5112bc60.c69e320a.1e98.2028@mx.google.com> On Wed, 06 Feb 2013, Fields, Christopher J wrote: > On Feb 6, 2013, at 1:36 PM, Siddhartha Basu > wrote: > > > Hi, > > > > On Tue, 05 Feb 2013, Fields, Christopher J wrote: > > > >> All, > >> > >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! 
> >> > >> Amongst all other changes, as mentioned in a separate thread we will remove Bio::FeatureIO, now developed in a separate repository: > >> > >> https://github.com/bioperl/Bio-FeatureIO > >> > >> Feedback, suggestions, etc are greatly appreciated. > > > > Here are CI build report on 5.12, 5.14 and 5.16 using travis. > > https://api.travis-ci.org/jobs/4623997/log.txt?deansi=true > > https://api.travis-ci.org/jobs/4623998/log.txt?deansi=true > > https://api.travis-ci.org/jobs/4623999/log.txt?deansi=true > > > > Could not get 5.10 to work on travis. Though i activated the (--network) > > option, it still didn't run one of the test that needs network. Also, initially got > > confused by the fact that though it has dist.ini, the tests still has > > to run through Build.PL. Running **dzil test** do not work. > > > > Hope this helps. > > > > thanks, > > -siddhartha > > Just to point out, that was for Bio-FeatureIO. Truthfully I'm not worried about that one yet; got to get over Mt. Everest first (the main release). So, what are steps left for getting the release out to CPAN. Like are there lot of feature branches still left to be merged, are there a lot of unit tests still not passing. Just trying to figure out anyway i could be of any help to expedite the release process. However, if they are already taken care of, please ignore. > > Build.PL is there mainly as a convenience for users w/o Dist::Zilla, which, last I recall, had a higher dependency list than even BioPerl (though I may be mistaken). I'll probably have to set up a Build.PL that can be clobbered by Dist::Zilla as needed. As far as the error i encountered, presence of Build.PL was blocking dzil build/release process. And by default, dzil expects to generate Build.PL during its build/release process. However, i am not sure which mode is the most suitable for bioperl devs. > Or we can just get rid of it and insist that dev. code has to be added via 'use lib' or PERL5LIB, and not allow installation. 
thanks, -siddhartha > > chris From hlapp at drycafe.net Wed Feb 6 21:30:33 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 6 Feb 2013 16:30:33 -0500 Subject: [Bioperl-l] dependencies on perl version In-Reply-To: <20754.39343.128576.743448@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: Great points, George, and you're making a very compelling argument. I'm in total agreement. It's almost becoming a reason to having to be embarrassed to still be programming in Perl these days, so one might as well have fun while it lasts. -hilmar On Feb 6, 2013, at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: >> [...] >> Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point >> out that Python users are in the same boat: the Python version for >> CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 >> (and recommends python 2.7). >> >> We can always state that perl 5.8 is supported for the upcoming >> Bioperl release, but we're dropping v5.8 support for any future >> releases. > > Do more than drop support for 5.8. > > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. 
> > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. > > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. > > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... 
They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Wed Feb 6 22:11:06 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:11:06 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> George, Should put your post on a pedestal :) tl;dr version: I completely agree, but we need help in order to do this. Long(-winded) version: I agree completely, backwards compatibility is killing us. But, we do need current and new people to get involved and help drive this forward. We need people on all fronts, from coding and bug fixes to documentation and web site maintenance. I've been driving this bus for a number of years now. Not getting tired yet, but I am getting substantially busier with my current endeavors, so my time spent working on BioPerl has dwindled considerably. 
Any additional support or sharing of responsibilities will help tremendously in keeping up momentum (if someone else wants to take the wheel for a bit, please let me know :). If we follow the perl release route, we should streamline the release process (think Dist::Zilla), end support of older versions of Perl, and work on a sustainable release schedule. The fact that we have so many of us so-called 'old folks' speaking up in favor of this is a very good sign. We do need a bit more than that; we need help. BioPerl is a very large project. A key point we need to address, which is very important for the future of BioPerl. I use Perl quite a bit in my current work (dabble with Ruby and Python as well when I have to). BioPerl? A little, but not as much as I could. Shocked? The main three reason I don't use it 'in anger': performance, performance, and performance. It is very important that we make a concerted effort to address this at all levels. It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them). A specific example: Heng Li once tested the performance of FASTQ parsing (perl, python, bioperl, biopython, his C code, etc). BioPerl's FASTQ couldn't even be measured; IIRC it went on for many hours until he killed it. This was with the older version of the parser, but I'm willing to bet the newer one I wrote isn't any better. This. needs. to. change. I see no problem in stating any generic parsing and low-level interfaces are just as much a part of what BioPerl encompasses as the higher-level Bio::* classes themselves. Steve and Jason were on to something with SearchIO; it's maybe not as performant as we would like, but it certainly is more flexible in terms of what can be done, b/c it separates out low-level parsing from object creation. That's the general model we should look at. 
There is a good reason Biopython is following this model with their SearchIO implementation (Peter C, are you reading this?) We have a lot of very talented people involved with this project, both on the purely computational and purely biological end as well as the folks like me who straddle the two domains. A lot of good code out there that can be used, wrapped, taken advantage of, including everything we currently have in BioPerl. Let's come up with something that both works and works well, that people can use on a regular basis, even at a low level if they choose. That alone would dissuade new users from writing up (yet another) custom FASTA/FASTQ/BLAST/GenBank/etc parser b/c the BioPerl one takes millennia to finish. A few examples on this front: Rob Buels created a generic parser for GFF3 (Bio::GFF3::LowLevel) with very few dependencies, we wrap this with the newer Bio::FeatureIO code. Leon has Bio::SFF. Lincoln of course wrote Bio::DB::Sam and Bio::DB::BigFile. I have started a wrapper around Heng's FASTQ/FASTA parsing code (kseq), it seems to work quite well (~20M FASTQ in 30 sec last I recall?). So: If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that. If it means creating a new Bio-NGS repo to focus some of these efforts, so be it. If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it. If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes). If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later. But I can't do it alone. We (not just me, but we) need to drive the direction we take. First one who codes gets the gold ring. chris On Feb 6, 2013, at 12:47 PM, Aaron Mackey wrote: > Huzzah! > > -- > Aaron J. 
Mackey, PhD > Assistant Professor > Center for Public Health Genomics > University of Virginia > amackey at virginia.edu > http://www.cphg.virginia.edu/mackey > > > On Wed, Feb 6, 2013 at 12:58 PM, George Hartzell wrote: > Fields, Christopher J writes: > > [...] > > Right, it took ~8 yrs to go from 5.8 to 5.10. I'd like to point > > out that Python users are in the same boat: the Python version for > > CentOS 5 is 2.4.3, and Biopython requires a minimum of python 2.5 > > (and recommends python 2.7). > > > > We can always state that perl 5.8 is supported for the upcoming > > Bioperl release, but we're dropping v5.8 support for any future > > releases. > > Do more than drop support for 5.8. > > The Perl community has put a transparent and predictable process in > place for releasing [generally] better versions of the language. It > means that Perl has a chance of continuing to be relevant, attracting > new talent and actually *fixing* some of the s&%t that gives Perl a > bad rap. It gives people something to plan around, no one should be > surprised that v 5.X.Y is coming out in mid 20ZZ. > > BioPerl should do the same thing, declare a release policy that trails > along with the Perl release schedule. Keep it simple and no one can > argue with it. Support Perl releases as long as the releases > themselves are supported. > > Rather than expending energy supporting out of date platforms, put the > energy into being modern (or Modern...), better distro building and > packaging, testing, documentation and releasing so that the process of > staying current is painless. > > Look forward. Keep it interesting and fun. > > Everyone running Mac OS 9 on their Pismo, raise your hand. Anyone > make their living running sequencing gels in Plexiglas doohickeys on > their lab bench? > > I'm not suggesting that the BioPerl community is free to make > arbitrary and capricious changes that makes it difficult for *anyone* > to get anything done. Churn is a waste of time. 
> > But why should the all-volunteer BioPerl community be stuck supporting > code from 12 years ago because it's cost effective for someone else to > avoid spending *their* $/time/people to stay up to date. > > Those sites that value stability/maturity/stagnation so highly have > already accepted the cost/difficulty of nailing one of their feet to > the floor as they try to run forward. They recognize and depend on > the benefits of having that stable base but generally they've also > accepted the costs associated with their restrictive choices. They > know how to pull in separate kernel/driver updates so that they can > actually run on nearly modern hardware. They know, and live with, the > fact that they're not going to have access to the shiny new stuff. > And they know how to stay up to date, when they need to, with the > software that their users need to be competitive (e.g. BioConductor > and R). > > As long as (if/when...) updating a BioPerl release is something that > can reliably happen with a few cpanm invocations then the sites that > otherwise favor punctuated equilibrium will learn to handle gradual > change. > > Those folks that are "stuck" on older releases always have the option > of supporting professional Perl programmers to keep older releases > going, backport changes, etc.... They're already buying support for > their platforms (or freeloading and coping), let them put bread on the > table at one of the bioinformatics consultancies or labs if they have > something special they need. > > Have fun. Use sharp tools. Do cool science. Build cool things. No > one is paying you to be backwards compatible with the previous > millennium. > > g. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From cjfields at illinois.edu Wed Feb 6 22:34:42 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 6 Feb 2013 22:34:42 +0000
Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1AF0C@CHIMBX5.ad.uillinois.edu>

I want to clarify: parser optimization isn't the only point we need to focus on by any means (and may not be the main one). There is a lot of room for improvement top to bottom; that was just one specific example I have long held to be an issue.

-c

On Feb 6, 2013, at 4:11 PM, "Fields, Christopher J" wrote:

> Shocked? The main three reason I don't use it 'in anger': performance, performance, and performance. It is very important that we make a concerted effort to address this at all levels. It could be as simple as completely separating parsing from object creation (where the bulk of performance problems seem to lie, but not all of them).

...
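
For readers following the thread, the idea of separating low-level parsing from object creation could look roughly like the sketch below. It is only an illustration, not BioPerl code: next_fastq_record and My::Read are hypothetical names, and the parser assumes strict four-line FASTQ records (no wrapped sequence lines) with Phred+33 qualities.

```perl
use strict;
use warnings;

# Low-level layer: returns one plain hash ref per record, no objects.
# Assumption: strict four-line FASTQ records.
sub next_fastq_record {
    my ($fh) = @_;
    my $header = <$fh>;
    return unless defined $header;          # end of input
    my $seq  = <$fh>;
    my $plus = <$fh>;
    my $qual = <$fh>;
    die "Truncated FASTQ record\n" unless defined $qual;
    chomp($header, $seq, $plus, $qual);
    die "Bad FASTQ header: $header\n"    unless $header =~ s/^\@//;
    die "Bad FASTQ separator: $plus\n"   unless $plus =~ /^\+/;
    die "Length mismatch for $header\n"  unless length($seq) == length($qual);
    my ($id, $desc) = split /\s+/, $header, 2;
    return {
        id   => $id,
        desc => defined $desc ? $desc : '',
        seq  => $seq,
        qual => $qual,
    };
}

# High-level layer: a heavier object, built only when the caller wants one.
{
    package My::Read;    # hypothetical class, stands in for a Bio::Seq-like object
    sub new  { my ($class, $rec) = @_; return bless { %$rec }, $class }
    sub id   { $_[0]{id} }
    sub seq  { $_[0]{seq} }
    # Decode Phred+33 qualities on demand rather than at parse time.
    sub phred_scores { [ map { ord($_) - 33 } split //, $_[0]{qual} ] }
}

# Fast path: iterate over plain records without creating any objects.
my $data = "\@r1 first read\nACGT\n+\nIIII\n\@r2\nGGCC\n+\nHHHH\n";
open my $fh, '<', \$data or die $!;
while (my $rec = next_fastq_record($fh)) {
    print $rec->{id}, "\t", length($rec->{seq}), "\n";
}
```

Callers that only need IDs and lengths stay on the hash-based fast path; callers that want the richer interface pass a record to My::Read->new and pay the object cost only for those records.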
From p.j.a.cock at googlemail.com Wed Feb 6 22:43:13 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 6 Feb 2013 22:43:13 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J wrote: > > I see no problem in stating any generic parsing and low-level interfaces > are just as much a part of what BioPerl encompasses as the higher-level > Bio::* classes themselves. Steve and Jason were on to something with > SearchIO; it's maybe not as performant as we would like, but it certainly > is more flexible in terms of what can be done, b/c it separates out > low-level parsing from object creation. That's the general model we > should look at. There is a good reason Biopython is following this > model with their SearchIO implementation (Peter C, are you reading this?) Actually I don't think we did end up with that kind of separation in the Biopython SearchIO - which is not so say it isn't an excellent model to follow. Rather the Biopython SearchIO (like the BioPerl one) had as the first goal a consistent object model across assorted file formats. The idea of a low level minimal overhead parsers (which are very format specific), on which a heavier but consistent object model can be built might be a good balance - the high level API has the connivence, but if you give that up you can have more speed. That's what I recommend with FASTQ and Biopython, e.g. 
http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > > I have started a wrapper around Heng's FASTQ/FASTA parsing > code (kseq), it seems to work quite well (~20M FASTQ in 30 sec > last I recall?). > I'd have to dig through my emails, but I think the BioRuby guys looked at that too - as I recall while it was fast, the error handling left something to be desired. Email me directly or on the BioRuby list if you want to follow up on that. Regards, Peter From cjfields at illinois.edu Wed Feb 6 22:53:21 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 6 Feb 2013 22:53:21 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > wrote: >> >> I see no problem in stating any generic parsing and low-level interfaces >> are just as much a part of what BioPerl encompasses as the higher-level >> Bio::* classes themselves. Steve and Jason were on to something with >> SearchIO; it's maybe not as performant as we would like, but it certainly >> is more flexible in terms of what can be done, b/c it separates out >> low-level parsing from object creation. That's the general model we >> should look at. There is a good reason Biopython is following this >> model with their SearchIO implementation (Peter C, are you reading this?) > > Actually I don't think we did end up with that kind of separation in the > Biopython SearchIO - which is not so say it isn't an excellent model > to follow. 
Rather the Biopython SearchIO (like the BioPerl one) had > as the first goal a consistent object model across assorted file > formats. > > The idea of a low level minimal overhead parsers (which are very > format specific), on which a heavier but consistent object model > can be built might be a good balance - the high level API has the > connivence, but if you give that up you can have more speed. > That's what I recommend with FASTQ and Biopython, e.g. > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > >> >> I have started a wrapper around Heng's FASTQ/FASTA parsing >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec >> last I recall?). >> > > I'd have to dig through my emails, but I think the BioRuby guys > looked at that too - as I recall while it was fast, the error handling > left something to be desired. Email me directly or on the BioRuby > list if you want to follow up on that. > > Regards, > > Peter I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out. IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into. Maybe worth moving to open-bio-l for broader discussion. chris From whereverroadgoes at gmail.com Wed Feb 6 21:59:04 2013 From: whereverroadgoes at gmail.com (Slym) Date: Wed, 6 Feb 2013 13:59:04 -0800 (PST) Subject: [Bioperl-l] Bio::Tools::SeqStats to count bases In-Reply-To: <87txpr26jj.fsf@topper.koldfront.dk> References: <82ac767b-8816-41af-9b9f-1dd3fa1b0a49@googlegroups.com> <87txpr26jj.fsf@topper.koldfront.dk> Message-ID: <411e920d-e614-417d-9198-78bef9adba16@googlegroups.com> Everything's working now! Thank you very much, especially to you Adam! 
>

From carandraug+dev at gmail.com Thu Feb 7 01:38:20 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Thu, 7 Feb 2013 01:38:20 +0000
Subject: [Bioperl-l] dependencies on perl version
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu>
Message-ID:

On 5 February 2013 20:56, Fields, Christopher J wrote:
> On Feb 5, 2013, at 2:06 PM, Carnë Draug wrote:
>> how much perl backwards compatibility does bioperl needs to keep?
>
> Aim for 5.10.1, but be careful of smart-match.

Well, I solved my problem differently and ended up not needing any of
the new features. But next time I'll know.

Thanks
Carnë

From pcantalupo at gmail.com Thu Feb 7 04:04:08 2013
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Wed, 6 Feb 2013 23:04:08 -0500
Subject: [Bioperl-l] bug 3376 status needs updated
Message-ID:

Hi,

A few months ago, I fixed bug 3376
(https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
updated to resolved or closed. Should I do this or is Chris the only one
who does that?

Thank you,

Paul

From cjfields at illinois.edu Thu Feb 7 04:20:30 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 7 Feb 2013 04:20:30 +0000
Subject: [Bioperl-l] bug 3376 status needs updated
In-Reply-To:
References:
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1B45C@CHIMBX5.ad.uillinois.edu>

No, go ahead and close it. Let me know if you run into perm. problems with it.

chris

On Feb 6, 2013, at 10:04 PM, Paul Cantalupo wrote:

> Hi,
>
> A few months ago, I fixed bug 3376 (
> https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2).
> The Redmine bug page (https://redmine.open-bio.org/issues/3376) hasn't been
> updated to resolved or closed. Should I do this or is Chris the only one
> who does that?
> > Thank you, > > Paul > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From l.m.timmermans at students.uu.nl Thu Feb 7 09:07:57 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Thu, 7 Feb 2013 10:07:57 +0100 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <5112bc60.c69e320a.1e98.2028@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE184AF@CHIMBX5.ad.uillinois.edu> <5112b0b3.a5dc320a.4105.1fe3@mx.google.com> <118F034CF4C3EF48A96F86CE585B94BF6CE1A1AB@CHIMBX5.ad.uillinois.edu> <5112bc60.c69e320a.1e98.2028@mx.google.com> Message-ID: On Wed, Feb 6, 2013 at 9:26 PM, Siddhartha Basu wrote: > As far as the error i encountered, presence of Build.PL was blocking dzil > build/release process. And by default, dzil expects to generate > Build.PL during its build/release process. However, i am not sure which > mode is the most suitable for bioperl devs. You can prune the Build.PL, and then let dzil add its own. We wouldn't be the first to do that sort of thing. Leon From amackey at virginia.edu Thu Feb 7 15:25:07 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Thu, 7 Feb 2013 10:25:07 -0500 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> Message-ID: You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. 
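
A lazy, pull-based record of the kind suggested here might be sketched as follows. This is a hedged illustration only: Lazy::HitLine is a hypothetical class, not part of BioPerl, and it assumes tab-delimited BLAST tabular output in which the first three columns are query ID, subject ID, and percent identity.

```perl
use strict;
use warnings;

{
    package Lazy::HitLine;    # hypothetical class name, not BioPerl API

    # Keep only the raw line; nothing is parsed at construction time.
    sub new {
        my ($class, $line) = @_;
        chomp $line;
        return bless { raw => $line }, $class;
    }

    # Split the line the first time any field is requested, then cache it.
    sub _fields {
        my ($self) = @_;
        $self->{fields} ||= [ split /\t/, $self->{raw} ];
        return $self->{fields};
    }

    # Accessors pull from the cached split (column order is an assumption).
    sub query_name       { $_[0]->_fields->[0] }
    sub hit_name         { $_[0]->_fields->[1] }
    sub percent_identity { $_[0]->_fields->[2] }

    # Reports whether the line has been split yet (for demonstration).
    sub is_parsed        { exists $_[0]{fields} ? 1 : 0 }
}

my $hit = Lazy::HitLine->new("query1\tsubject1\t98.5\t100\n");
print "parsed before access: ", $hit->is_parsed, "\n";
print "hit name: ", $hit->hit_name, "\n";
print "parsed after access: ", $hit->is_parsed, "\n";
```

Records that are filtered out before any accessor is called never pay the splitting cost, and a malformed line only causes trouble if something actually asks for its fields, which is where the error tolerance comes from.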
-Aaron -- Aaron J. Mackey, PhD Assistant Professor Center for Public Health Genomics University of Virginia amackey at virginia.edu http://www.cphg.virginia.edu/mackey On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J wrote: > On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > > > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > > wrote: > >> > >> I see no problem in stating any generic parsing and low-level interfaces > >> are just as much a part of what BioPerl encompasses as the higher-level > >> Bio::* classes themselves. Steve and Jason were on to something with > >> SearchIO; it's maybe not as performant as we would like, but it > certainly > >> is more flexible in terms of what can be done, b/c it separates out > >> low-level parsing from object creation. That's the general model we > >> should look at. There is a good reason Biopython is following this > >> model with their SearchIO implementation (Peter C, are you reading > this?) > > > > Actually I don't think we did end up with that kind of separation in the > > Biopython SearchIO - which is not so say it isn't an excellent model > > to follow. Rather the Biopython SearchIO (like the BioPerl one) had > > as the first goal a consistent object model across assorted file > > formats. > > > > The idea of a low level minimal overhead parsers (which are very > > format specific), on which a heavier but consistent object model > > can be built might be a good balance - the high level API has the > > connivence, but if you give that up you can have more speed. > > That's what I recommend with FASTQ and Biopython, e.g. > > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > > > >> > >> I have started a wrapper around Heng's FASTQ/FASTA parsing > >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec > >> last I recall?). 
> >> > > > > I'd have to dig through my emails, but I think the BioRuby guys > > looked at that too - as I recall while it was fast, the error handling > > left something to be desired. Email me directly or on the BioRuby > > list if you want to follow up on that. > > > > Regards, > > > > Peter > > I did a little on this, worth following up on, but I pulled the FASTQ test > examples you created from the paper to test it out. IIRC it parsed where > it needed to, but I'm not sure how it handled bad sequences, so yes, worth > looking into. Maybe worth moving to open-bio-l for broader discussion. > > chris > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From tiago.hori at gmail.com Thu Feb 7 14:58:37 2013 From: tiago.hori at gmail.com (Tiago Hori) Date: Thu, 7 Feb 2013 06:58:37 -0800 (PST) Subject: [Bioperl-l] Search I::O In-Reply-To: <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com> References: <39b1269f-63a7-4b29-af79-8c93ab231abf@googlegroups.com> <6B0BCF1B-4B67-4697-9B34-8F822B4DC565@gmail.com> Message-ID: Thanks, Jason! It is working Now. So here is what I am trying to accomplish. For a given Blastx report, I want to extract the best BLASTx hit that is human, and does not contain unnamed or Predicted. I got very close, but I still can't get it to give me only the top BLAST hit, it gives me all blast hits that meet my criteria. I tried using "last" to stop it from looping through the hits, once it found a human one, but it didn't work. Can someone help? Here is my code so far (mostly stolen for the wiki). 
use strict;
use Bio::SearchIO;

my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => 'testsalmon.txt');
while( my $result = $in->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
    ## $hit is a Bio::Search::Hit::HitI compliant object
    if( $hit->description !~ /[Uu]nnamed|PREDICTED|hypothetical/){
      if( $hit->description =~ /Homo sapiens/){
        while( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
            if ( $hsp->percent_identity >= 30) {
              if( $hsp->evalue <= 1e-05){
                print "Query=", $result->query_name,"\t",
                  " Description=", $hit->description,"\t",
                  " Hit=", $hit->name,"\t",
                  " Length=", $hsp->length('total'),"\t",
                  " Percent_id=", $hsp->percent_identity,"\t",
              }
            }
          }
        }
      }
    }
  }
}

T.

On Wednesday, February 6, 2013 6:46:47 PM UTC-3:30, Jason Stajich wrote:
>
> you are missing a comma after the -format => 'blast'
> should be
> my $in = Bio::SearchIO->new(-format => 'blast',
>                             -file   => 'XXX' );
>
>
> On Feb 5, 2013, at 7:21 AM, Tiago Hori wrote:
>
> > Hi All,
> >
> > I am trying to find the best putative orthologs for 44K Atlantic Salmon
> > sequences, and so I need to parse 44K BLAST reports to find the best human
> > hit.
I am trying to learn Seach::IO, but when I try the first example on > > the HOWTO: use strict; > > use Bio::SearchIO; > > > > my $in = new Bio::SearchIO(-format => 'blast' > > -file => 'C001R047.txt'); > > > > while( my $result = $in->next_result ) { > > ## $result is a Bio::Search::Result::ResultI compliant object > > while( my $hit = $result->next_hit ) { > > ## $hit is a Bio::Search::Hit::HitI compliant object > > while( my $hsp = $hit->next_hsp ) { > > ## $hsp is a Bio::Search::HSP::HSPI compliant object > > if( $hsp->length('total') > 50 ) { > > if ( $hsp->percent_identity >= 75 ) { > > print "Query=", $result->query_name, > > " Hit=", $hit->name, > > " Length=", $hsp->length('total'), > > " Percent_id=", $hsp->percent_identity, "\n"; > > } > > } > > } > > } > > } > > > > I get this error: Odd number of elements in hash assignment at > > /usr/local/share/perl/5.14.2/Bio/SearchIO.pm line 189. > > > > I am using BioPerl version 1.6.901. Is there a format problem with the > > blast reports? > > > > Any help would be greatly appreciated! > > > > T. > > _______________________________________________ > > Bioperl-l mailing list > > Biop... at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Jason Stajich > jason.... at gmail.com > ja... 
at bioperl.org > > From cjfields at illinois.edu Thu Feb 7 15:56:04 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 15:56:04 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> This will likely be the approach for more NGS-friendly Bio::Seq class. Calculation of the PHRED scores could also be deferred until needed. seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it. chris On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. > > -Aaron > > -- > Aaron J. Mackey, PhD > Assistant Professor > Center for Public Health Genomics > University of Virginia > amackey at virginia.edu > http://www.cphg.virginia.edu/mackey > > > On Wed, Feb 6, 2013 at 5:53 PM, Fields, Christopher J wrote: > On Feb 6, 2013, at 4:43 PM, Peter Cock wrote: > > > On Wed, Feb 6, 2013 at 10:11 PM, Fields, Christopher J > > wrote: > >> > >> I see no problem in stating any generic parsing and low-level interfaces > >> are just as much a part of what BioPerl encompasses as the higher-level > >> Bio::* classes themselves. 
Steve and Jason were on to something with > >> SearchIO; it's maybe not as performant as we would like, but it certainly > >> is more flexible in terms of what can be done, b/c it separates out > >> low-level parsing from object creation. That's the general model we > >> should look at. There is a good reason Biopython is following this > >> model with their SearchIO implementation (Peter C, are you reading this?) > > > > Actually I don't think we did end up with that kind of separation in the > > Biopython SearchIO - which is not so say it isn't an excellent model > > to follow. Rather the Biopython SearchIO (like the BioPerl one) had > > as the first goal a consistent object model across assorted file > > formats. > > > > The idea of a low level minimal overhead parsers (which are very > > format specific), on which a heavier but consistent object model > > can be built might be a good balance - the high level API has the > > connivence, but if you give that up you can have more speed. > > That's what I recommend with FASTQ and Biopython, e.g. > > http://news.open-bio.org/news/2009/09/biopython-fast-fastq/ > > > >> > >> I have started a wrapper around Heng's FASTQ/FASTA parsing > >> code (kseq), it seems to work quite well (~20M FASTQ in 30 sec > >> last I recall?). > >> > > > > I'd have to dig through my emails, but I think the BioRuby guys > > looked at that too - as I recall while it was fast, the error handling > > left something to be desired. Email me directly or on the BioRuby > > list if you want to follow up on that. > > > > Regards, > > > > Peter > > I did a little on this, worth following up on, but I pulled the FASTQ test examples you created from the paper to test it out. IIRC it parsed where it needed to, but I'm not sure how it handled bad sequences, so yes, worth looking into. Maybe worth moving to open-bio-l for broader discussion. 
> > chris > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From amackey at virginia.edu Thu Feb 7 16:09:14 2013 From: amackey at virginia.edu (Aaron Mackey) Date: Thu, 7 Feb 2013 11:09:14 -0500 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries: while (my $seq = $seqio->nextseq) { $ct++ }; in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries. This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material. Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using. What's the state of thread-safe Perl these days? 
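To make the pull-based idea above a little more concrete, here is a minimal sketch in plain Perl. The package names LazyFastq::Reader and LazyFastq::Record are hypothetical (they are not BioPerl classes), and the sketch assumes strict four-line FASTQ records with Phred+33 quality encoding. The reader only chunks the input; the header, sequence and quality fields are split out the first time an accessor is called.

```perl
use strict;
use warnings;

# Hypothetical package names; neither is part of BioPerl.
package LazyFastq::Record;

sub new {
    my ($class, $raw) = @_;
    return bless { raw => $raw }, $class;   # keep the unparsed four-line blob
}

# Split the blob into fields only when an accessor first needs them.
sub _fields {
    my $self = shift;
    unless ($self->{fields}) {
        my ($head, $seq, undef, $qual) = split /\n/, $self->{raw};
        $self->{fields} = { head => $head, seq => $seq, qual => $qual };
    }
    return $self->{fields};
}

# Identifier: first whitespace-delimited token after the leading '@'.
sub id {
    my $self = shift;
    my ($id) = $self->_fields->{head} =~ /^\@(\S+)/;
    return $id;
}

sub seq { return $_[0]->_fields->{seq} }

# Phred scores are computed on demand, assuming the Phred+33 encoding.
sub phred {
    my $self = shift;
    $self->{phred} ||= [ map { ord($_) - 33 } split //, $self->_fields->{qual} ];
    return $self->{phred};
}

package LazyFastq::Reader;

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh }, $class;
}

# Read exactly four lines and hand them back unparsed.
# A truncated final record is silently dropped in this sketch.
sub next_seq {
    my $self = shift;
    my $fh   = $self->{fh};
    my $raw  = '';
    for (1 .. 4) {
        my $line = <$fh>;
        return unless defined $line;
        $raw .= $line;
    }
    return LazyFastq::Record->new($raw);
}

package main;

# Two records held in memory so the example needs no input file.
my $data = "\@r1 first\nACGT\n+\nIIII\n\@r2\nGGCC\n+\n!!!!\n";
open my $fh, '<', \$data or die $!;

my $reader = LazyFastq::Reader->new($fh);
my ($count, $first) = (0);
while (my $rec = $reader->next_seq) {
    $first ||= $rec;    # nothing has been parsed at this point
    $count++;
}
print "$count records; first id: ", $first->id, "\n";
```

Counting records never touches the quality string, so the cost of the Phred conversion is only paid by callers that ask for it. Error handling (wrapped sequence lines, missing '+' separators) is deliberately left out here.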
-Aaron On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < cjfields at illinois.edu> wrote: > This will likely be the approach for more NGS-friendly Bio::Seq class. > Calculation of the PHRED scores could also be deferred until needed. > > seqtk has some C-based methods that we could possibly take advantage of, > but will have to look into it. > > chris > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > You might also want to consider a lazy/pull-based parser to defer > parsing/object-building for pieces of the object that don't get used. This > also usually provides some error tolerance. > > > > -Aaron > From sidd.basu at gmail.com Thu Feb 7 16:38:47 2013 From: sidd.basu at gmail.com (Siddhartha Basu) Date: Thu, 7 Feb 2013 10:38:47 -0600 Subject: [Bioperl-l] Re: FASTQ, was Re:BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: <5113d899.ea64320a.489a.262d@mx.google.com> Another approach might be use map-reduce(Hadoop) if possible. I have seen one implementation in biopython's GFF3 parser. http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/ -siddhartha On Thu, 07 Feb 2013, Aaron Mackey wrote: > e.g., a pull-based FASTQ parser that did nothing else at the top level but > "chunk" the file into as-yet-unparsed four-line blobs could appear to work > very fast, if the user code did nothing but count the number of entries: > > while (my $seq = $seqio->nextseq) { $ct++ }; > > in other words, you defer *everything* except the minimal amount of > parsing/logic required to detect object boundaries. 
> > This is, in fact, the exact opposite of the event-based SearchIO "push" > parsers, which always perform the most parsing possible, despite the user > never accessing most of the material. > > Lastly, with respect to performance, if the parsing/object building > operation is not simply IO bound, then parallel parser/object-building CPU > threads could be considered, which could then dynamically adapt to > pre-parse attributes (e.g. quality scores) that the calling code was > actually using. What's the state of thread-safe Perl these days? > > -Aaron > > > On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < > cjfields at illinois.edu> wrote: > > > This will likely be the approach for more NGS-friendly Bio::Seq class. > > Calculation of the PHRED scores could also be deferred until needed. > > > > seqtk has some C-based methods that we could possibly take advantage of, > > but will have to look into it. > > > > chris > > > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > > > You might also want to consider a lazy/pull-based parser to defer > > parsing/object-building for pieces of the object that don't get used. This > > also usually provides some error tolerance. 
> > > > > > -Aaron > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Feb 7 16:55:53 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 16:55:53 +0000 Subject: [Bioperl-l] FASTQ, was Re:BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <5113d899.ea64320a.489a.262d@mx.google.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> <5113d899.ea64320a.489a.262d@mx.google.com> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7B8@CHIMBX5.ad.uillinois.edu> I think we will want to allow for a multitude of implementations. SeqIO already allows for that to a degree, but multiple backend implementations (say, different ways of parsing/processing FASTQ and others) isn't supported yet. chris On Feb 7, 2013, at 10:38 AM, Siddhartha Basu wrote: > Another approach might be use map-reduce(Hadoop) if possible. I have > seen one implementation in biopython's GFF3 parser. > http://bcbio.wordpress.com/2009/03/22/mapreduce-implementation-of-gff-parsing-for-biopython/ > > -siddhartha > > > On Thu, 07 Feb 2013, Aaron Mackey wrote: > >> e.g., a pull-based FASTQ parser that did nothing else at the top level but >> "chunk" the file into as-yet-unparsed four-line blobs could appear to work >> very fast, if the user code did nothing but count the number of entries: >> >> while (my $seq = $seqio->nextseq) { $ct++ }; >> >> in other words, you defer *everything* except the minimal amount of >> parsing/logic required to detect object boundaries. 
>> >> This is, in fact, the exact opposite of the event-based SearchIO "push" >> parsers, which always perform the most parsing possible, despite the user >> never accessing most of the material. >> >> Lastly, with respect to performance, if the parsing/object building >> operation is not simply IO bound, then parallel parser/object-building CPU >> threads could be considered, which could then dynamically adapt to >> pre-parse attributes (e.g. quality scores) that the calling code was >> actually using. What's the state of thread-safe Perl these days? >> >> -Aaron >> >> >> On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J < >> cjfields at illinois.edu> wrote: >> >>> This will likely be the approach for more NGS-friendly Bio::Seq class. >>> Calculation of the PHRED scores could also be deferred until needed. >>> >>> seqtk has some C-based methods that we could possibly take advantage of, >>> but will have to look into it. >>> >>> chris >>> >>> On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: >>> >>>> You might also want to consider a lazy/pull-based parser to defer >>> parsing/object-building for pieces of the object that don't get used. This >>> also usually provides some error tolerance. 
>>>> >>>> -Aaron >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Feb 7 17:01:07 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 17:01:07 +0000 Subject: [Bioperl-l] FASTQ, was Re: BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1B05F@CHIMBX5.ad.uillinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1C42B@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1C7EF@CHIMBX5.ad.uillinois.edu> re: thread-safe perl, so-so at best from what I understand. chris On Feb 7, 2013, at 10:09 AM, Aaron Mackey wrote: > e.g., a pull-based FASTQ parser that did nothing else at the top level but "chunk" the file into as-yet-unparsed four-line blobs could appear to work very fast, if the user code did nothing but count the number of entries: > > while (my $seq = $seqio->nextseq) { $ct++ }; > > in other words, you defer *everything* except the minimal amount of parsing/logic required to detect object boundaries. > > This is, in fact, the exact opposite of the event-based SearchIO "push" parsers, which always perform the most parsing possible, despite the user never accessing most of the material. 
> > Lastly, with respect to performance, if the parsing/object building operation is not simply IO bound, then parallel parser/object-building CPU threads could be considered, which could then dynamically adapt to pre-parse attributes (e.g. quality scores) that the calling code was actually using. What's the state of thread-safe Perl these days? > > -Aaron > > > On Thu, Feb 7, 2013 at 10:56 AM, Fields, Christopher J wrote: > This will likely be the approach for more NGS-friendly Bio::Seq class. Calculation of the PHRED scores could also be deferred until needed. > > seqtk has some C-based methods that we could possibly take advantage of, but will have to look into it. > > chris > > On Feb 7, 2013, at 9:25 AM, Aaron Mackey wrote: > > > You might also want to consider a lazy/pull-based parser to defer parsing/object-building for pieces of the object that don't get used. This also usually provides some error tolerance. > > > > -Aaron From hartzell at alerce.com Thu Feb 7 21:36:24 2013 From: hartzell at alerce.com (George Hartzell) Date: Thu, 7 Feb 2013 13:36:24 -0800 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> Message-ID: <20756.7768.125680.662488@gargle.gargle.HOWL> Fields, Christopher J writes: > George, > > Should put your post on a pedestal :) > > tl;dr version: I completely agree, but we need help in order to do this. > [...] And therein lies the [a] problem. Don't look at me.... I'm not coding on bioinformatics problems these days (though I'm available...) so _maybe_ I shouldn't have gotten up on the soapbox. 
But I'm so sick of getting into arguments (or walking away from
them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead,
you can't write good code in Perl, look - Ruby has GEMS!, etc...

Perl of the olden days was an easy language in which to write really
shitty code. Even the Perl of the BioPerl heyday wasn't really much
help; roll your own OO, roll your own distro-building, mountains of
monkey-work to provide consistent POD, versioning, etc...

But that's not the Perl that I use. I have Moose and Moo. TAP and
the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm.
MetaCPAN. Pinto. GitHub. Perlbrew. Wow.

It isn't any harder to write good code, for measures that I care
about, using Perl than it is in *any* of the other similar languages.

And it's just as easy, and happens just as frequently, for people to
write shitty (undocumented, untested, poorly managed, poorly packaged,
...) stuff in the other languages.

GET OFF MY LAWN, KID! (Yeah, I know...)

But BioPerl *is* dying. You might be standing on the shoulders of
giants when you use it to solve a problem, but you *definitely* have
those same giants (and their extended families) on your shoulders
every time I see you try to move the project forward. All of that
history has become the tail that's wagging the dog.

If all y'all are going to keep the thing alive, moving forward and
contributing to new great works then make Apple your hero. Deprecate
the stuff that's holding you back, give folks a path forward and move
on.

Have fun. Use sharp tools. Do cool science. Build cool things.
Advance your careers (forgot that one last time). Be reasonable and
professional.

Supporting last year's projects is someone else's business
opportunity.

g.

ps. Are all y'all following this thread?

http://news.ycombinator.com/item?id=5123022

Maybe someone should search down for this bit: "Where to start? Any
list of this [sic] projects?" and insert a plug for the various
open-bio projects.
(But "someone" doesn't work here, he said...). From cjfields at illinois.edu Thu Feb 7 23:12:19 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 7 Feb 2013 23:12:19 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1D071@CHIMBX5.ad.uillinois.edu> On Feb 7, 2013, at 3:36 PM, George Hartzell wrote: > Fields, Christopher J writes: >> George, >> >> Should put your post on a pedestal :) >> >> tl;dr version: I completely agree, but we need help in order to do this. >> [...] > > And therein lies the [a] problem. Don't look at me.... > > I'm not coding on bioinformatics problems these days (though I'm > available...) so _maybe_ I shouldn't have gotten up on the soapbox. > > But I'm so sick of getting into arguments (or walking away from > them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, > you can't write good code in Perl, look - Ruby has GEMS!, etc? Right, but that's a perception not just in the Bio* world. It's larger and more pervasive than that. > Perl of the olden days was an easy language in which to write really > shitty code. Even the Perl of the BioPerl heyday wasn't really much > help; role your own OO, role your own distro-building, mountains of > monkey-work to provide consistent POD, versioning, etc... > > But that's not the Perl that I use. I have Moose and Moo. TAP and > the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. > MetaCPAN. Pinto. GitHub. Perlbrew. Wow. Yes, and that is the direction we need to go in. 
> It isn't any harder to write good code, for measures that I care > about, using Perl than it is *any* of the other similar languages. > > And it's just as easy, and happens just as frequently, for people to > write shitty (undocumented, untested, poorly managed, poorly packaged, > ...) stuff in the other languages. Oh, I know. I'm working on some very nice looking but terribly implemented Python code now. > GET OFF MY LAWN, KID! (Yeah, I know...) > > But BioPerl *is* dying. You might be standing on the shoulders of > giants when you use it to solve a problem, but you *definitely* have > those same giants (and their extended families) on your shoulders > every time I see you try move the project forward. All of that > history has become the tail that's wagging the dog. Yep. > If all y'all are going to keep the thing alive, moving forward and > contributing to new great works then make Apple your hero. Deprecate > the stuff that's holding you back, give folks a path forward and move > on. That's fine. > Have fun. Use sharp tools. Do cool science. Build cool things. > Advance your careers (forgot that one last time). Be reasonable and > professional. > > Supporting last year's projects is someone else's business > opportunity. > > g. Right, but this isn't just my show. I can't do this alone; it's simply too much code and I don't have even 1/4 the time I used to have. > ps. Are all y'all following this thread? > > http://news.ycombinator.com/item?id=5123022 > > Maybe someone should search down for this bit: "Where to start? Any > list of this [sic] projects?" and insert a plug for the various > open-bio projects. (But "someone" doesn't work here, he said?). Read the original guy's post. He's completely delusional (okay, maybe not *completely*, but he comes across as quite bitter and unrealistic). Frankly I don't feel so bad if he wants to leave. He doesn't like messy things. Biology is messy, if one doesn't understand that then computational biology is not for them. 
chris

From carandraug+dev at gmail.com  Fri Feb  8 04:12:22 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Fri, 8 Feb 2013 04:12:22 +0000
Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version
Message-ID:

On 6 February 2013 22:11, "Fields, Christopher J" wrote:
> [...]
> So:
>
> If it means targeting performance, backwards-compatibility be damned (using Devel::NYTProf?), we do that.
>
> If it means creating a new Bio-NGS repo to focus some of these efforts, so be it.
>
> If it means we get away from the Java-based interface stuff in favor of something more Perl-like (roles anyone?), then I'm all for it.
>
> If it means we modularize BioPerl so this can be done, well, you probably know where I stand (yes).
>
> If it means this is to be BioPerl 2.0, then let's move that direction, sooner than later.
>
> But I can't do it alone. We (not just me, but we) need to drive the direction we take.
>
> First one who codes gets the gold ring.

Hi

I know I'm not much involved with bioperl development but here's my
suggestion as maintainer of another quite modular free software
project. I swear I'm not promoting it. Skip to the last paragraph for
the very short version.

Octave Forge is now a collection of packages for GNU Octave, each
released independently whenever its maintainer sees fit. But it wasn't
like that before. For a long time, everything was released at the same
time; there were no independent packages. Then it was decided to split
it into sections: main, extra and nonfree (free software dependent on
non-free libraries, now purged), and inside those, it was split into
packages, each with its own maintainer. But some packages were (and
are) more active than the others. Some packages even came from single
contributions and we never heard from the authors again. And so, with
time, cruft settled in. We didn't want to remove the code, but no one
was interested or comfortable enough in the field to fix it either.
Packages that had a much more active development were being dragged
down by code that no one was maintaining.

So we broke with that and each package is now released independently.
We have packages that haven't been released in 3 years, yes, but that
just shows which packages no one cares about. Those have been marked
as unmaintained and anyone can come around and make a release if they
care about it.

As the maintainer of the project, I do *not* make the releases of the
packages. The package maintainers prepare everything and upload them;
I only run a handful of tests (takes me 10min), upload it to our
server, and make the official announcement. I am also the maintainer
of one of the packages, and have often made releases of unmaintained
packages because I needed them. That's to show that, if they are
important enough for someone, they will get a release somehow. If they
are not important, why would we waste our time on them anyway?

We now have around 5 package releases per month, many of them being
minor releases with a handful of bug fixes. Preparing a release of a
small package is much easier and much less trouble than preparing a
giant release encompassing all of them at the same time.

Short version:
I'd recommend to split the project into much smaller ones. Some of the
small ones will wither and die but those are the less important ones,
and will allow the others, the ones that people care about, freedom to
grow faster. Bioperl would still be just one project, that
incorporates a hundred or so of smaller modules. Let those who care
the most about a specific module take care of it and make the
releases. Releasing a module becomes much simpler, which means more
releases, more activity, and the smaller code base for each module
also makes it less intimidating for new contributors.

Carnë

From hartzell at alerce.com  Fri Feb  8 06:17:17 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 7 Feb 2013 22:17:17 -0800
Subject: [Bioperl-l] injecting a bit of levity....
Message-ID: <20756.39021.553502.116384@gargle.gargle.HOWL>

Perl's not dead. It's FAMOUS!

http://imgs.xkcd.com/comics/perl_problems.png

g.

From carandraug+dev at gmail.com  Fri Feb  8 06:57:30 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Fri, 8 Feb 2013 06:57:30 +0000
Subject: [Bioperl-l] getting a Bio::Search::HSP::HSPI from Bio::SimpleAlign (to find differences between sequences)
Message-ID:

Hi

I already have a Bio::SimpleAlign object (got it after using TCoffee
through the bioperl-run module) and I'm trying to get a
Bio::Search::HSP::HSPI object from a pair of the aligned sequences.
How can I do this? I want to use the seq_inds method to compare the
sequences.

Here's my actual problem, just in case I should be trying to fix it
some other way. I have a bunch of sequences from protein isoforms.
They have small differences between them: point mutations, small
insertions or deletions, nothing too big. I want to make a table of
the mutations that each of them has against the consensus sequence. I
already made the alignment and have the consensus from
"$align->consensus_string". Now, I want to get something like:

isoform1: Ala67Gly, His90_Met91insGln
isoform2: ....

The seq_inds method from the Bio::Search::HSP::HSPI class seems to do
the part of finding the differences, but how can I get one? I can't
find it in the documentation. Any tips, and even showing a different
approach to my problem, are most appreciated.

Thanks,
Carnë
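One possible approach to the question above that does not need an HSP object is to walk the alignment columns directly. The sketch below is plain Perl and only an illustration: list_differences is a hypothetical helper (not a BioPerl method), it assumes both strings are gapped to the same length with '-' or '.', and it reports one-letter codes numbered along the ungapped reference rather than the three-letter notation in the example table.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl.
# Compare one gapped, aligned sequence against a gapped reference
# (for example the consensus) column by column.
sub list_differences {
    my ($ref, $seq) = @_;
    die "aligned strings differ in length\n"
        unless length($ref) == length($seq);

    my @diffs;
    my $pos = 0;    # 1-based position in the ungapped reference
    for my $i (0 .. length($ref) - 1) {
        my $r = substr($ref, $i, 1);
        my $s = substr($seq, $i, 1);
        my $ref_gap = ($r eq '-' || $r eq '.');
        my $seq_gap = ($s eq '-' || $s eq '.');

        $pos++ unless $ref_gap;
        next if $ref_gap && $seq_gap;    # gap in both: nothing to report

        if ($ref_gap) {
            push @diffs, "${pos}ins$s";      # residue inserted after reference position $pos
        }
        elsif ($seq_gap) {
            push @diffs, "$r${pos}del";      # reference residue missing from the sequence
        }
        elsif (uc $r ne uc $s) {
            push @diffs, "$r$pos$s";         # substitution
        }
    }
    return @diffs;
}

# Toy data standing in for the consensus and one isoform.
my @diffs = list_differences('MKT-AYG', 'MRTQA-G');
print "isoform1: ", join(', ', @diffs), "\n";
```

With a Bio::SimpleAlign object, the gapped strings would presumably come from the sequences returned by each_seq (calling seq on each) and the reference from consensus_string; that wiring is an assumption and is not tested here. Adjacent single-residue insertions are reported separately, so merging them into one event would need an extra pass.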
From l.m.timmermans at students.uu.nl Fri Feb 8 11:18:58 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Fri, 8 Feb 2013 12:18:58 +0100 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: <20756.7768.125680.662488@gargle.gargle.HOWL> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell wrote: > But I'm so sick of getting into arguments (or walking away from > them...) with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, > you can't write good code in Perl, look - Ruby has GEMS!, etc... > > Perl of the olden days was an easy language in which to write really > shitty code. Even the Perl of the BioPerl heyday wasn't really much > help; role your own OO, role your own distro-building, mountains of > monkey-work to provide consistent POD, versioning, etc... > > But that's not the Perl that I use. I have Moose and Moo. TAP and > the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. > MetaCPAN. Pinto. GitHub. Perlbrew. Wow. I share that experience. > But BioPerl *is* dying. You might be standing on the shoulders of > giants when you use it to solve a problem, but you *definitely* have > those same giants (and their extended families) on your shoulders > every time I see you try move the project forward. All of that > history has become the tail that's wagging the dog. I share your sentiment. Most of BioPerl is architected so badly I can't stomach it most days, and I've worked on hairy codebases included perl itself. There's just too much sick and wrong. It's like hundreds of dot-com-era cgi scripts. 
The problem (which is common in scientific computing) is that once code works it's effectively abandoned. BioPerl is essentially a gathering of more than a thousand such modules. > If all y'all are going to keep the thing alive, moving forward and > contributing to new great works then make Apple your hero. Deprecate > the stuff that's holding you back, give folks a path forward and move > on. That would be lovely, but who is going to do that? We're suffering from the tragedy of the commons. > Have fun. Use sharp tools. Do cool science. Build cool things. > Advance your careers (forgot that one last time). Be reasonable and > professional. Sounds like good advice to me :-) > Supporting last year's projects is someone else's business > opportunity. True! > ps. Are all y'all following this thread? > > http://news.ycombinator.com/item?id=5123022 > > Maybe someone should search down for this bit: "Where to start? Any > list of this [sic] projects?" and insert a plug for the various > open-bio projects. (But "someone" doesn't work here, he said...). Interesting discussion, though the original post is too cynical even for my taste. Leon From cjfields at illinois.edu Fri Feb 8 14:08:56 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 8 Feb 2013 14:08:56 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE1820A@CHIMBX5.ad.uillinois.edu> <5111C653.2010703@gmail.com> <118F034CF4C3EF48A96F86CE585B94BF6CE18F2E@CHIMBX5.ad.uillinois.edu> <20754.39343.128576.743448@gargle.gargle.HOWL> <118F034CF4C3EF48A96F86CE585B94BF6CE1ADA4@CHIMBX5.ad.uillinois.edu> <20756.7768.125680.662488@gargle.gargle.HOWL> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1DA2D@CHIMBX5.ad.uillinois.edu> On Feb 8, 2013, at 5:18 AM, Leon Timmermans wrote: > On Thu, Feb 7, 2013 at 10:36 PM, George Hartzell wrote: >> But I'm so sick of getting into arguments (or walking away from >> them...) 
with Ruby and Python [and lisp and *PHP*] fans; Perl is dead, >> you can't write good code in Perl, look - Ruby has GEMS!, etc... >> >> Perl of the olden days was an easy language in which to write really >> shitty code. Even the Perl of the BioPerl heyday wasn't really much >> help; roll your own OO, roll your own distro-building, mountains of >> monkey-work to provide consistent POD, versioning, etc... >> >> But that's not the Perl that I use. I have Moose and Moo. TAP and >> the things built on it. Dist::Zilla. PerlTidy. PerlCritic. cpanm. >> MetaCPAN. Pinto. GitHub. Perlbrew. Wow. > > I share that experience. > >> But BioPerl *is* dying. You might be standing on the shoulders of >> giants when you use it to solve a problem, but you *definitely* have >> those same giants (and their extended families) on your shoulders >> every time I see you try to move the project forward. All of that >> history has become the tail that's wagging the dog. > > I share your sentiment. Most of BioPerl is architected so badly I > can't stomach it most days, and I've worked on hairy codebases > including perl itself. There's just too much sick and wrong. It's like > hundreds of dot-com-era CGI scripts. > > The problem (which is common in scientific computing) is that once > code works it's effectively abandoned. BioPerl is essentially a > gathering of more than a thousand such modules. Yep, the progression from 'it works' to 'it works very well' tends to have very high activation energy. Many of the fixes tend to be more bandaids (get it working) than fundamental surgery. I tried my hand at this, got a few things done. >> If all y'all are going to keep the thing alive, moving forward and >> contributing to new great works then make Apple your hero. Deprecate >> the stuff that's holding you back, give folks a path forward and move >> on. > > That would be lovely, but who is going to do that? We're suffering > from the tragedy of the commons. 
Spot on, but we could break that path for the time being. I think BioPerl as is will have to be in maintenance mode; we need a new effort to break with older perl, older practices. >> Have fun. Use sharp tools. Do cool science. Build cool things. >> Advance your careers (forgot that one last time). Be reasonable and >> professional. > > Sounds like good advice to me :-) > >> Supporting last year's projects is someone else's business >> opportunity. > > True! We just need to make a bioperl 1.x branch for the maintenance bit, rechristen 'master' as 'v2', and just move on to fixing the f****** code. Let's move on that. >> ps. Are all y'all following this thread? >> >> http://news.ycombinator.com/item?id=5123022 >> >> Maybe someone should search down for this bit: "Where to start? Any >> list of this [sic] projects?" and insert a plug for the various >> open-bio projects. (But "someone" doesn't work here, he said...). > > Interesting discussion, though the original post is too cynical even > for my taste. > > Leon Yes, that's not unusual unfortunately. We have a number of physicists and mathematicians here who have started their initial forays into computational biology; they're all startled at how noisy it is and how messy code can be. Of course their disciplines have had the benefit of teaching students how to (somewhat decently) code for the last 40 years. chris From l.m.timmermans at students.uu.nl Fri Feb 8 12:08:06 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Fri, 8 Feb 2013 13:08:06 +0100 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: Message-ID: On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug wrote: > Short version: > I'd recommend to split the project into much smaller ones. Some of the > small ones will wither and die but those are the less important ones, > and will allow the others, the ones that people care about, freedom to > grow faster. 
Bioperl would still be just one project, that > incorporates a hundred or so of smaller modules. Let those who care > the most about a specific module to take care of it and make the > releases. Releasing a module becomes much simpler, which means more > releases, more activity, and the smaller code base for each module > also make it less intimidating for new contributors. That has been a goal for some time now, but it's fairly complicated. Not only do we have a LOT of modules (bioperl-live alone is more than 900), they also have complicated dependencies. I've attached the results of my static dependency analysis of bioperl-live. I suspect this split-up needs to be done by automated graph analysis; it's too much to do by hand. Leon -------------- next part -------------- A non-text attachment was scrubbed... Name: deps.dot Type: application/octet-stream Size: 93463 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: deps.png Type: image/png Size: 6694525 bytes Desc: not available URL: From sebastien.moretti at unil.ch Fri Feb 8 16:19:29 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=) Date: Fri, 08 Feb 2013 17:19:29 +0100 Subject: [Bioperl-l] PhyloXML Message-ID: <51152591.9010402@unil.ch> Hi I would like to add some XML to an existing PhyloXML tree. No problem to read and write it. I would like to add something after the tag as in http://www.phyloxml.org/examples_syntax/phyloxml_syntax_example_1.html but get problems with add_phyloXML_annotation(): Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, line 1 (#1) (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor does any of its base classes. See perlobj. 
Uncaught exception from user code: Can't locate object method "annotation" via package "Bio::Tree::Tree" at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984, line 1. at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 984 Bio::TreeIO::phyloxml::element_default('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 670 Bio::TreeIO::phyloxml::processXMLNode('Bio::TreeIO::phyloxml=HASH(0x134b1268)') called at /software/share/perl5/vendor_perl/Bio/TreeIO/phyloxml.pm line 309 Bio::TreeIO::phyloxml::add_phyloXML_annotation('Bio::TreeIO::phyloxml=HASH(0x134b1268)', '-obj', 'Bio::Tree::Tree=HASH(0x13525258)', '-xml', 'SUMF family') called at ./add_annotation_to_phyloxml.pl line 40 I think I am doing something wrong, but what? Here is the code my $treeio = new Bio::TreeIO(-file => "$infile", -format => 'phyloxml', ); my $tree = $treeio->next_tree; # Add annotation $treeio->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); -- Sébastien Moretti From cjfields at illinois.edu Sat Feb 9 06:25:17 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 06:25:17 +0000 Subject: [Bioperl-l] BioPerl future Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). 
That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? chris From cjfields at illinois.edu Sat Feb 9 06:43:24 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 06:43:24 +0000 Subject: [Bioperl-l] BioPerl long-term, was Re: dependencies on perl version In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1F2C6@CHIMBX5.ad.uillinois.edu> On Feb 8, 2013, at 6:08 AM, Leon Timmermans wrote: > On Fri, Feb 8, 2013 at 5:12 AM, Carnë Draug wrote: >> Short version: >> I'd recommend to split the project into much smaller ones. Some of the >> small ones will wither and die but those are the less important ones, >> and will allow the others, the ones that people care about, freedom to >> grow faster. Bioperl would still be just one project, that >> incorporates a hundred or so of smaller modules. Let those who care >> the most about a specific module to take care of it and make the >> releases. Releasing a module becomes much simpler, which means more >> releases, more activity, and the smaller code base for each module >> also make it less intimidating for new contributors. > > That has been a goal for some time now, but it's fairly complicated. > Not only do we have a LOT of modules (bioperl-live alone is more than > 900), they also have complicated dependencies. I've attached the > results of my static dependency analysis of bioperl-live. I suspect > this split-up needs to be done by automated graph analysis; it's too much > to do by hand. > > Leon > Leon, I'm hoping we can do this sooner than later. In fact, if we proceed with making a 'v1' branch or something similar, we can start extricating out code sooner than later (next few weeks). 
chris From cjfields at illinois.edu Sat Feb 9 13:51:35 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sat, 9 Feb 2013 13:51:35 +0000 Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future Message-ID: Sheldon, The branch is where the old (v1.x) code would reside. Master branch would be v2. Chris Sent via phone -------- Original message -------- From: Sheldon McKay Date: To: "Fields, Christopher J" Cc: BioPerl List ,gmod-gbrowse at lists.sourceforge.net Subject: Re: [Gmod-gbrowse] BioPerl future Hi Chris, This sounds like a good idea. I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands. One question I have is whether the name of branch v1 might be perceived as a step backward. How about v2? Sheldon On Saturday, February 9, 2013, Fields, Christopher J wrote: All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? 
chris -- Sheldon McKay, PhD Computational Biologist DNA Learning Center Cold Spring Harbor Laboratory 1 Bungtown Rd Cold Spring Harbor, NY 11724 (516) 367-5185 www.dnalc.org From sheldon.mckay at gmail.com Sat Feb 9 13:04:50 2013 From: sheldon.mckay at gmail.com (Sheldon McKay) Date: Sat, 9 Feb 2013 08:04:50 -0500 Subject: [Bioperl-l] [Gmod-gbrowse] BioPerl future In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE1F217@CHIMBX5.ad.uillinois.edu> Message-ID: Hi Chris, This sounds like a good idea. I think it will eventually allow bioperl to evolve into a leaner, meaner package that would be more likely to be adopted by new or isolated bioinformaticians, who tend to be put off by the size and complexity of bioperl as it now stands. One question I have is whether the name of branch v1 might be perceived as a step backward. How about v2? Sheldon On Saturday, February 9, 2013, Fields, Christopher J wrote: > All, > > (cross-posting to gmod-gbrowse) > > I want to gauge the community's thoughts on a few things. At the moment I > think we can safely say that BioPerl 1.x is in maintenance mode. By > 'maintenance mode', I mean that we can only do so much with it w/o breaking > backwards compatibility with old scripts. We need a way forward so that we > can address fundamental problems within the core codebase, namely speed. 
> > I am thinking at the moment of pushing a 'v1' branch next week after I > make an official announcement, with a new 1.6 release coming out from that > branch (as already announced, tentatively scheduled for March 1). That > frees up master for any code development, removal of modules/cruft, etc. > This will open an initial path forward and at least enable us to do more. > Make sense? This of course means that any code reliant on v1 should pull > from that branch instead of 'master'. > > Thoughts? > > chris > > ------------------------------------------------------------------------------ > Free Next-Gen Firewall Hardware Offer > Buy your Sophos next-gen firewall before the end March 2013 > and get the hardware for free! Learn more. > http://p.sf.net/sfu/sophos-d2d-feb > _______________________________________________ > Gmod-gbrowse mailing list > Gmod-gbrowse at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse > -- Sheldon McKay, PhD Computational Biologist DNA Learning Center Cold Spring Harbor Laboratory 1 Bungtown Rd Cold Spring Harbor, NY 11724 (516) 367-5185 www.dnalc.org From cjfields at illinois.edu Sun Feb 10 04:25:14 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 10 Feb 2013 04:25:14 +0000 Subject: [Bioperl-l] BioPerl future In-Reply-To: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> Apologies if you receive this twice. I never received the replies from the gbrowse list through bioperl-l so it is possible there were mail issues last night. ------------------------ All, (cross-posting to gmod-gbrowse) I want to gauge the community's thoughts on a few things. At the moment I think we can safely say that BioPerl 1.x is in maintenance mode. 
By 'maintenance mode', I mean that we can only do so much with it w/o breaking backwards compatibility with old scripts. We need a way forward so that we can address fundamental problems within the core codebase, namely speed. I am thinking at the moment of pushing a 'v1' branch next week after I make an official announcement, with a new 1.6 release coming out from that branch (as already announced, tentatively scheduled for March 1). That frees up master for any code development, removal of modules/cruft, etc. This will open an initial path forward and at least enable us to do more. Make sense? This of course means that any code reliant on v1 should pull from that branch instead of 'master'. Thoughts? chris From genehack at genehack.org Sun Feb 10 04:36:07 2013 From: genehack at genehack.org (John SJ Anderson) Date: Sat, 9 Feb 2013 20:36:07 -0800 Subject: [Bioperl-l] BioPerl future In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> References: <2F53583C-9E7D-4D6A-A4C2-E5C27DDBA493@illinois.edu> <118F034CF4C3EF48A96F86CE585B94BF6CE1FC4C@CHIMBX5.ad.uillinois.edu> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6@genehack.org> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: > Thoughts? +1 The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. j. 
-- John SJ Anderson // genehack at genehack.org From carandraug+dev at gmail.com Sun Feb 10 18:40:33 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Sun, 10 Feb 2013 18:40:33 +0000 Subject: [Bioperl-l] BioPerl future Message-ID: On 10 February 2013 17:00, wrote: > Message: 3 > Date: Sat, 9 Feb 2013 20:36:07 -0800 > From: John SJ Anderson > Subject: Re: [Bioperl-l] BioPerl future > To: "Fields, Christopher J" > Cc: BioPerl List > Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org> > Content-Type: text/plain; charset=us-ascii > > On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: > >> Thoughts? > > +1 > > The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. For those interested, I have just added instructions on the wiki on how to split a subset of modules, tests, files, etc from the bioperl-live repository into a new repository while keeping their old history. http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live Carnë From cjfields at illinois.edu Sun Feb 10 20:08:35 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Sun, 10 Feb 2013 20:08:35 +0000 Subject: [Bioperl-l] BioPerl future In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE20632@CHIMBX5.ad.uillinois.edu> On Feb 10, 2013, at 12:40 PM, Carnë Draug wrote: > On 10 February 2013 17:00, wrote: >> Message: 3 >> Date: Sat, 9 Feb 2013 20:36:07 -0800 >> From: John SJ Anderson >> Subject: Re: [Bioperl-l] BioPerl future >> To: "Fields, Christopher J" >> Cc: BioPerl List >> Message-ID: <668BED38-61AE-4D21-A3BD-B7AEC9361EF6 at genehack.org> >> Content-Type: text/plain; charset=us-ascii >> >> On Feb 9, 2013, at 8:25 PM, "Fields, Christopher J" wrote: >> >>> Thoughts? 
>> >> +1 >> >> The other thing to maybe give some advance thought to is organization of the new development. Maybe instead of one big repository, we can encourage the more loosely coupled small pieces that everybody seems to realize we need by having more, smaller repositories. > > For those interested, I have just added instructions on the wiki on > how to split a subset of modules, tests, files, etc from the > bioperl-live repository into a new repository while keeping their old > history. > > http://www.bioperl.org/wiki/Using_Git/Advanced#Split_a_module_from_bioperl-live > > Carn? It's probably worth looking at this page as well, then: http://www.bioperl.org/wiki/BioPerl_Modularization We should probably merge the two. chris From hlapp at drycafe.net Mon Feb 11 01:03:34 2013 From: hlapp at drycafe.net (Hilmar Lapp) Date: Sun, 10 Feb 2013 20:03:34 -0500 Subject: [Bioperl-l] PhyloXML In-Reply-To: <51152591.9010402@unil.ch> References: <51152591.9010402@unil.ch> Message-ID: On Feb 8, 2013, at 11:19 AM, Moretti S?bastien wrote: > # Add annotation > $treeio->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From sebastien.moretti at unil.ch Mon Feb 11 07:08:22 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=) Date: Mon, 11 Feb 2013 08:08:22 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: References: <51152591.9010402@unil.ch> Message-ID: <511898E6.7060400@unil.ch> >> # Add annotation >> $treeio->add_phyloXML_annotation(-obj => $tree, >> -xml => 'SUMF family', >> ); > > If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? > > -hilmar I replaced $treeio by $tree in the above line but still get an error. Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. my $treeio = new Bio::TreeIO(-file => "$infile", -format => 'phyloxml', ); my $tree = $treeio->next_tree; # Add annotation $tree->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor does any of its base classes. See perlobj. Uncaught exception from user code: Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. 
at ./add_annotation_to_phyloxml.pl line 40 -- S?bastien Moretti Department of Ecology and Evolution, Biophore, University of Lausanne, CH-1015 Lausanne, Switzerland Tel.: +41 (21) 692 4221/4079 http://bioinfo.unil.ch/ From saladi1 at illinois.edu Tue Feb 12 21:24:34 2013 From: saladi1 at illinois.edu (Shyam Saladi) Date: Tue, 12 Feb 2013 13:24:34 -0800 Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons Message-ID: Hi, I am using the count_codons method from Bio::Tools::SeqStats and keep getting "AMBIGUOUS" codons, but I can't figure out why exactly. When I translate the same sequence that gives the error using another standard utility like (ExPASy - Translate), it seems to work alright. An example sequence is below. Could anyone lend some insight? Thanks, Shyam AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC AGT *AMBIGUOUS* ATA ATC ATG ATT CAA CAC CAG CAT CCA CCC CCG CCT CGA CGC CGG CGT CTA CTC CTG CTT GAA GAC GAG GAT GCA GCC GCG GCT GGA GGC GGG GGT GTA GTC GTG GTT TAA TAC TAT TCA TCC TCG TCT TGG TGT TTA TTC TTG TTT count filename 1.722488038277511961722488038277511961722 2.966507177033492822966507177033492822967 1.531100478468899521531100478468899521531 0.9569377990430622009569377990430622009569 0.4784688995215311004784688995215311004785 1.722488038277511961722488038277511961722 1.33971291866028708133971291866028708134 1.913875598086124401913875598086124401914 0.1913875598086124401913875598086124401914 0.7655502392344497607655502392344497607656 1.435406698564593301435406698564593301435 * 0.09569377990430622009569377990430622009569* 0.3827751196172248803827751196172248803828 2.488038277511961722488038277511961722488 3.349282296650717703349282296650717703349 3.636363636363636363636363636363636363636 2.870813397129186602870813397129186602871 0.3827751196172248803827751196172248803828 1.626794258373205741626794258373205741627 0.4784688995215311004784688995215311004785 1.722488038277511961722488038277511961722 0.5741626794258373205741626794258373205742 
1.052631578947368421052631578947368421053 1.244019138755980861244019138755980861244 0.3827751196172248803827751196172248803828 0.7655502392344497607655502392344497607656 0.1913875598086124401913875598086124401914 2.488038277511961722488038277511961722488 0.4784688995215311004784688995215311004785 0.6698564593301435406698564593301435406699 2.105263157894736842105263157894736842105 0.8612440191387559808612440191387559808612 2.870813397129186602870813397129186602871 1.435406698564593301435406698564593301435 1.722488038277511961722488038277511961722 2.775119617224880382775119617224880382775 2.00956937799043062200956937799043062201 2.488038277511961722488038277511961722488 3.540669856459330143540669856459330143541 2.00956937799043062200956937799043062201 0.1913875598086124401913875598086124401914 2.392344497607655502392344497607655502392 0.8612440191387559808612440191387559808612 5.454545454545454545454545454545454545455 1.913875598086124401913875598086124401914 0.8612440191387559808612440191387559808612 4.593301435406698564593301435406698564593 2.679425837320574162679425837320574162679 0.09569377990430622009569377990430622009569 1.148325358851674641148325358851674641148 1.148325358851674641148325358851674641148 0.8612440191387559808612440191387559808612 0.4784688995215311004784688995215311004785 2.105263157894736842105263157894736842105 0.9569377990430622009569377990430622009569 0.9569377990430622009569377990430622009569 0.09569377990430622009569377990430622009569 2.679425837320574162679425837320574162679 2.966507177033492822966507177033492822967 3.062200956937799043062200956937799043062 2.775119617224880382775119617224880382775 1045 temp.seq 
ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTACGCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTCGTATTTGCCTTCGTACCACCTGC
GGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAGATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTAGGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA From bosborne11 at verizon.net Wed Feb 13 02:30:08 2013 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 12 Feb 2013 21:30:08 -0500 Subject: [Bioperl-l] Bio::Tools::SeqStats->count_codons In-Reply-To: References: Message-ID: Shyam, An ambiguous codon would be one that has a character other than [ACTGU] in it. I see '!' in your sequences, that would create an ambiguous codon. Brian O. On Feb 12, 2013, at 4:24 PM, Shyam Saladi wrote: > Hi, > > I am using the count_codons method from Bio::Tools::SeqStats and keep > getting "AMBIGUOUS" codons, but I can't figure out why exactly. > > When I translate the same sequence that gives the error using another > standard utility like (ExPASy - Translate), it seems to work alright. > > An example sequence is below. Could anyone lend some insight? 
> > Thanks, > Shyam > > > > AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC > AGT *AMBIGUOUS* ATA ATC ATG ATT CAA CAC > CAG CAT CCA CCC CCG CCT CGA CGC CGG > CGT CTA CTC CTG CTT GAA GAC GAG GAT GCA > GCC GCG GCT GGA GGC GGG GGT GTA GTC > GTG GTT TAA TAC TAT TCA TCC TCG TCT TGG > TGT TTA TTC TTG TTT count filename > 1.722488038277511961722488038277511961722 > 2.966507177033492822966507177033492822967 > 1.531100478468899521531100478468899521531 > 0.9569377990430622009569377990430622009569 > 0.4784688995215311004784688995215311004785 > 1.722488038277511961722488038277511961722 > 1.33971291866028708133971291866028708134 > 1.913875598086124401913875598086124401914 > 0.1913875598086124401913875598086124401914 > 0.7655502392344497607655502392344497607656 > 1.435406698564593301435406698564593301435 * > 0.09569377990430622009569377990430622009569* > 0.3827751196172248803827751196172248803828 > 2.488038277511961722488038277511961722488 > 3.349282296650717703349282296650717703349 > 3.636363636363636363636363636363636363636 > 2.870813397129186602870813397129186602871 > 0.3827751196172248803827751196172248803828 > 1.626794258373205741626794258373205741627 > 0.4784688995215311004784688995215311004785 > 1.722488038277511961722488038277511961722 > 0.5741626794258373205741626794258373205742 > 1.052631578947368421052631578947368421053 > 1.244019138755980861244019138755980861244 > 0.3827751196172248803827751196172248803828 > 0.7655502392344497607655502392344497607656 > 0.1913875598086124401913875598086124401914 > 2.488038277511961722488038277511961722488 > 0.4784688995215311004784688995215311004785 > 0.6698564593301435406698564593301435406699 > 2.105263157894736842105263157894736842105 > 0.8612440191387559808612440191387559808612 > 2.870813397129186602870813397129186602871 > 1.435406698564593301435406698564593301435 > 1.722488038277511961722488038277511961722 > 2.775119617224880382775119617224880382775 > 2.00956937799043062200956937799043062201 > 2.488038277511961722488038277511961722488 > 
3.540669856459330143540669856459330143541 > 2.00956937799043062200956937799043062201 > 0.1913875598086124401913875598086124401914 > 2.392344497607655502392344497607655502392 > 0.8612440191387559808612440191387559808612 > 5.454545454545454545454545454545454545455 > 1.913875598086124401913875598086124401914 > 0.8612440191387559808612440191387559808612 > 4.593301435406698564593301435406698564593 > 2.679425837320574162679425837320574162679 > 0.09569377990430622009569377990430622009569 > 1.148325358851674641148325358851674641148 > 1.148325358851674641148325358851674641148 > 0.8612440191387559808612440191387559808612 > 0.4784688995215311004784688995215311004785 > 2.105263157894736842105263157894736842105 > 0.9569377990430622009569377990430622009569 > 0.9569377990430622009569377990430622009569 > 0.09569377990430622009569377990430622009569 > 2.679425837320574162679425837320574162679 > 2.966507177033492822966507177033492822967 > 3.062200956937799043062200956937799043062 > 2.775119617224880382775119617224880382775 1045 temp.seq > > 
ATGGCACGTTTTTTTATTGATCGTCCCATCTTTGCGTGGGTGATCGCCTTAATTATTATGTTGGCGGGGGTGCTTTCAATTCGCACCCTGCCGGTTTCTCAATATCCCAGCATTGCACCGCCAACCGTGGTGATCAGTGCTAACTACCCTGGTGCATCGGCCAAGATTGTTGAAGACTCAGTGACTCAGGTGATTGAGCAACGCATGAAGGGTATCGATCACCTACGTTATATTGCCTCAACCAGCGATAGTTTCGGTAATGCTGAAATCACTTTGACCTTCAATGCCGAAGCCGATCCTGATATTGCTCAGGTACAAGTTCAGAACAAATTGCAGGGTGCAATGACCCTGTTACCACAAGAGGTACAGGCTCAAGGGGTTGACGTTAACAAATCAAGTTCTGGCTTYTTGATGGTGCTGGGTTTCGTATCGACTGACGGTTCCTTAGATAAAGGCGACATCGCCGACTATGTGGGTGCAAACGTACAAGATCCCATGAGCCGTGTACCGGGCGTGGGTGAAATTCAGCTGTTTGGTGCCCAATATGCGATGCGTATATGGCTTGATCCTTTAAAACTGACTCAATATAACTTGACCAGTTTAGAGGTGATCTCGGCGATTCGTGCTCAAAACGCGCAGGTGTCTGCGGGTCAGTTGGGTGGTACGCCGTCAATTCAAGGGCAAGAACTTAACGCCACTGTTTCGGCGCAAAGTCGTTTGCAAACCCCTGAAGAGTTTCGCAAGATTATCCTGAAGTCTGATACTTCGGGTGCGAATGTGTTCCTCGGTGATGTGGCGCGCGTAGAGTTAGGTTCAGAGAGTTATGCCGTTGTCTCGTTCTACAATGGTAAGCCTGCTACTGGTTTAGCGATTAAACTGGCGACAGGCGCAAACGCGTTGGATACCGCTGAAGCTGTTCGTGATAAAGTTGAAGAATTGCGACCTTTCTTCCCGCAAGGGTTGGATGTTGTTTATCCCTACGATACTAC! > GCCATTCGTTGAGAAATCGATAGAAGGCGTGGTACACACCCTGCTCGAAGCGATTGTTCTGGTGTTTGTCATCATGTACCTCTTCCTGCAAAACTTCCGTGCGACCTTAATTCCGACGATTGCGGTACCAGTGGTCTTGCTGGGAACGTTTGCGATTTTGTCGGCCACGGGCTTCTCTATCAACACCCTTACCATGTTTGCTATGGTGCTGGCGATTGGTCTGTTGGTGGACGACGCCATCGTGGTGGTTGAAAACGTTGAGCGGGTGATGTCGGAAGAAGGGTTGAGCCCACTCGAAGCGACTCGTAAATCGATGGATCAAATCACTGGCGCCTTAGTTGGTATTGGTTTGACGTTATCTGCTGTATTTGTGCCAATGGCATTTATGTCGGGTTCTACTGGGGTCATTTACCGTCAGTTCTCGATCACTATCGTGTCTGCGATGGCATTGTCGGTATTAGTGGCCTTGATTTTAACGCCGGCACTTTGTGCCACTATGTTAAAACCCGTGCAGAAGGGACATGGTCATATTGAAACCGGTTTCTTCGGTTGGTTTAACCGTAACTTTGATCGCTTAACTAACCGTTACGAATCCAGTGTGGCGGGCATAGTGAAGCGTGGCTTTAGAGTCATGATGATTTATGTGGCTTTAGTGGTCGCCGTCGGTTGGATCTTCATGCGTATGCCAACTGCATTCTTACCCGATGAAGACCAAGGTATCTTGTTTACGCAGGCGATTTTGCCAACAAACTCGACTCAAGAAAGTACCCTCAAAGTGCTGGATAAGGTATCCGATCACTTCATGGCTGAAGAAGGCGTGAGATCGGTATTCAGCGTGGCGGGCTTTAGCTTTGCGGGTCAAGGCCAAAACATGGGTATCGCTTTCGTTGGCTTGAAGGATTGGTCAGAGCGTGAAGCACCTGGTATGGATGTGCAGTCTATTGCGGGTCGTGCTATGGGTGCCTTTAGTCAAATTAAAGACGCCTTC! 
> GTATTTGCCTTCGTACCACCTGCGGTTATTGAGCTGGGTACGGCGAATGGTTTTGACATGTACCTGCAAG > ATAAAAACGGTCAAGGCCACGATAAGTTAATAGCGGCTCGTAACCAATTGCTGGGTATGGCGGCTCAGAATCCAAACCTTATGGGTGTTCGCCCTAATGGTCAGGAAGATGCGCCAATCTATCAATTGCATATTGATCATGCAAAGTTGAGCGCATTAGGCGTTGATATTGCTAACGTTAACAGTGTGTTGGCAACTGCTTGGGGTGGTTCCTATGTGAACGATTTTATCGACCGCGGCCGTGTGAAAAAGGTATTTGTGCAAGGTGATGCCCAATACCGTATGCAGCCTGAAGACCTCAACACTTGGTACGTGCGTAACAACAAGGGTGACATGGTGCCATTTTCGGCCTTTGCAACAGGTTCTTGGGAATACGGCTCACCGCGTCTAGAACGTTTTAACGGTTTACCAGCGGTGAATATTCAAGGCGCAACTGCACCAGGCTTTAGTACGGGTGCTGCCATGACTATCATGGAGGACTTAGTTAAGCAGCTACCACCTGGCTTTGGCATCGAGTGGAACGGCTTATCCTACGAGGAACGTTTATCGGGTAACCAAGCACCAGCCTTGTATGCGTTGTCGATTCTGGTGGTATTCCTTGTATTAGCAGCCTTGTATGAAAGCTGGTCAGTACCGTTTGCGGTTATCCTTGTGGTTCCATTGGGGATTATCGGTGCTCTATTGGCGATGAATGGTCGAGGCTTGCCTAACGACGTGTTCTTCCAAGTGGGTCTGTTAACAACGGTTGGTTTGGCAACCAAGAACGCCATCTTGATTGTGGAATTTGCAAAAGAATTCTACGAGAAGGGGGCGGGTCTGGTTGAGGCGACCTTACATGCGGTCCGCGTGCGTTTACGTCCGATTTTAATGACGTCGCTCGCTTTTGGTCTGGGGGTTGTACCGCTAGCCATTAGTACAGGTGTGGGTTCGGGCAGTCAGAACGCCATTGGTACCGGTGTACTTGGCGGTATGATGAGTTCGACCTTCTTA! > GGTATCTTCTTCGTGCCACTGTTCTTCGTCATTGTTGAGCGGATCTTCAGTAAACGAGAGCGAAAAGCGAAAGAGAAAAATCCTACGTCGACGGATTAA > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 13 15:18:10 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 15:18:10 +0000 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> All, tl;dr: A lot of change is coming. Be forewarned and be prepared. This is an 'official' announcement to the BioPerl community on future BioPerl plans. We have decided to move continued maintenance of Bioperl release series over to the new 'v1' branch. 
This branch will be the point where any future versions of 1.6.x code will be released, starting with the (already-scheduled) March 1 release.

The 'master' branch will become the main focal point for future development of BioPerl going into an eventual v2 release, with a focus on performance enhancements, addressing newer technologies like NGS and large data, code cleanup, and simplifying the code base. We welcome any help with code improvements. GMOD folks? Want to help? This is a good opportunity to address BioPerl short-comings in the code base!

What this means for anyone using BioPerl currently:

1) We anticipate significant issues if you are relying on the 'master' branch for anything. To inelegantly state it, the core developers are taking back the 'master' branch for future development. Please please please do not rely on the 'master' branch for stable code; if you are reliant on the BioPerl 1.6.x, make sure to use 'v1'. We can revisit whether to make 'v1' the default checkout branch if/when the need arises.

2) Expect not to find some modules. We will be migrating modules requiring external dependencies and other associated chunks of the code base out into their own repositories over the next year to help future maintenance; the eventual intent is to release all of these independently on CPAN. We will completely remove all code previously marked as deprecated, and we may immediately deprecate additional modules if needed (this will of course be discussed on list).

3) Expect version numbering to change significantly. Because we are releasing code in separate repositories, I fully expect downstream versioning problems if we stick with the current system (e.g. all bioperl-live modules having the same version). It will be too much of a headache to sync versions for all modules as this will entail making a full release of all bioperl code, one of the main reasons we are splitting out code to begin with.
At the moment, no specific versioning scheme has been chosen, though I *highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions). This is the standard that Lincoln has adopted for Bio::Graphics and GBrowse.

4) Expect quick deprecation of methods within modules as needed. These should of course be brought up to the mail list prior to actual implementation, but I would anticipate some things changing as we try to adopt a more consistent method naming scheme.

5) The same steps outlined for bioperl-live will apply for bioperl-run modules. We will have to decide the best approach to use for those, e.g. whether to separate them out based on task (alignment), application group (NGS, BLAST, RNA), etc. and how these may fit organically with bioperl-live modules where appropriate.

6) Do not expect a new CPAN release of such code until Dec 2013. Even then it will be in an alpha stage. We are all busy campers.

We do not anticipate significant changes to bioperl-network or bioperl-db at this time beyond updating them to deal with new changes.

I'm sure there are many other points that need to be discussed. Please reply over the next week if you have any concerns.

chris

From cjfields at illinois.edu Wed Feb 13 16:01:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 13 Feb 2013 16:01:07 +0000
Subject: [Bioperl-l] Test-pls ignore
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2506D@CHIMBX5.ad.uillinois.edu>

testing the mail list to see if it is working.
-c From sebastien.moretti at unil.ch Wed Feb 13 16:21:23 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?Moretti_S=E9bastien?=) Date: Wed, 13 Feb 2013 17:21:23 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> Message-ID: <511BBD83.2000708@unil.ch> >>>> # Add annotation >>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>> -xml => 'SUMF family', >>>> ); >>> >>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >>> >>> -hilmar >> >> I replaced $treeio by $tree in the above line but still get an error. >> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >> >> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. >> >> >> >> my $treeio = new Bio::TreeIO(-file => "$infile", >> -format => 'phyloxml', >> ); >> my $tree = $treeio->next_tree; >> >> # Add annotation >> $tree->add_phyloXML_annotation(-obj => $tree, >> -xml => 'SUMF family', >> ); >> >> Can't locate object method "add_phyloXML_annotation" via package >> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >> (F) You called a method correctly, and it correctly indicated a package >> functioning as a class, but that package doesn't define that particular >> method, nor does any of its base classes. See perlobj. >> >> Uncaught exception from user code: >> Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. >> at ./add_annotation_to_phyloxml.pl line 40 > > Will have to look into this. 
One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. > > chris You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? The problem I have is "expected" ? -- S?bastien Moretti From cjfields at illinois.edu Wed Feb 13 15:47:17 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 15:47:17 +0000 Subject: [Bioperl-l] PhyloXML In-Reply-To: <511898E6.7060400@unil.ch> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> On Feb 11, 2013, at 1:08 AM, S?bastien MORETTI wrote: >>> # Add annotation >>> $treeio->add_phyloXML_annotation(-obj => $tree, >>> -xml => 'SUMF family', >>> ); >> >> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >> >> -hilmar > > I replaced $treeio by $tree in the above line but still get an error. > Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" > > The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. > > > > my $treeio = new Bio::TreeIO(-file => "$infile", > -format => 'phyloxml', > ); > my $tree = $treeio->next_tree; > > # Add annotation > $tree->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); > > Can't locate object method "add_phyloXML_annotation" via package > "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) > (F) You called a method correctly, and it correctly indicated a package > functioning as a class, but that package doesn't define that particular > method, nor does any of its base classes. See perlobj. 
> > Uncaught exception from user code: > Can't locate object method "add_phyloXML_annotation" via package "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1. > at ./add_annotation_to_phyloxml.pl line 40 > > > > -- > S?bastien Moretti > Department of Ecology and Evolution, > Biophore, University of Lausanne, > CH-1015 Lausanne, Switzerland > Tel.: +41 (21) 692 4221/4079 > http://bioinfo.unil.ch/\ Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. chris From carandraug+dev at gmail.com Wed Feb 13 17:23:23 2013 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Wed, 13 Feb 2013 17:23:23 +0000 Subject: [Bioperl-l] Next BioPerl release Message-ID: On 5 February 2013 21:53, Fields, Christopher J wrote: > I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! Hi is this release of bioperl-live only or also includes bioperl-run? Carn? From cjfields at illinois.edu Wed Feb 13 17:08:21 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 17:08:21 +0000 Subject: [Bioperl-l] PhyloXML In-Reply-To: <511BBD83.2000708@unil.ch> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> <511BBD83.2000708@unil.ch> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 10:21 AM, Moretti S?bastien wrote: >>>>> # Add annotation >>>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>>> -xml => 'SUMF family', >>>>> ); >>>> >>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? 
>>>> >>>> -hilmar >>> >>> I replaced $treeio by $tree in the above line but still get an error. >>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >>> >>> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. >>> >>> >>> >>> my $treeio = new Bio::TreeIO(-file => "$infile", >>> -format => 'phyloxml', >>> ); >>> my $tree = $treeio->next_tree; >>> >>> # Add annotation >>> $tree->add_phyloXML_annotation(-obj => $tree, >>> -xml => 'SUMF family', >>> ); >>> >>> Can't locate object method "add_phyloXML_annotation" via package >>> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >>> (F) You called a method correctly, and it correctly indicated a package >>> functioning as a class, but that package doesn't define that particular >>> method, nor does any of its base classes. See perlobj. >>> >>> Uncaught exception from user code: >>> >>> at ./add_annotation_to_phyloxml.pl line 40 >> >> Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. >> >> chris > > You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? > The problem I have is "expected" ? > > -- > S?bastien Moretti I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky. I tried cleaning this up a few years back but didn't make much progress. The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it): $treeio->add_phyloXML_annotation(-obj => $tree, -xml => 'SUMF family', ); My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back. Can you file a bug on this? 
https://redmine.open-bio.org/ chris From cjfields at illinois.edu Wed Feb 13 18:05:53 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 18:05:53 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: References: Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 11:23 AM, Carn? Draug wrote: > On 5 February 2013 21:53, Fields, Christopher J wrote: >> I am scheduling the next BioPerl CPAN release tentatively for March 1. Any help in triaging bug reports would be greatly appreciated! > > Hi > > is this release of bioperl-live only or also includes bioperl-run? > > Carn? We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance. chris From l.m.timmermans at students.uu.nl Wed Feb 13 19:44:22 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Wed, 13 Feb 2013 20:44:22 +0100 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> Message-ID: On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J wrote: > We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance. Also, bioperl-run needs to be cut into smaller distributions even more than bioperl-live. 
Few people if anyone at all has all tools it tries to wrap at hand, so its almost impossible to pass its testing suite. We need dists that can realistically pass. Leon From cjfields at illinois.edu Wed Feb 13 21:04:26 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 13 Feb 2013 21:04:26 +0000 Subject: [Bioperl-l] Next BioPerl release In-Reply-To: References: <118F034CF4C3EF48A96F86CE585B94BF6CE25573@CHIMBX5.ad.uillinois.edu> Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE25B07@CHIMBX5.ad.uillinois.edu> On Feb 13, 2013, at 1:44 PM, Leon Timmermans wrote: > On Wed, Feb 13, 2013 at 7:05 PM, Fields, Christopher J > wrote: >> We can work on a bioperl-run release. It's too much to handle both in one go. The problem I have faced with bioperl-run in the past is similar to bioperl-live, that the tools used are a moving target and that makes the wrappers easily out-of-date. I would really like a more flexible generic way of defining these that would allow for easier maintenance. > > Also, bioperl-run needs to be cut into smaller distributions even more > than bioperl-live. Few people if anyone at all has all tools it tries > to wrap at hand, so its almost impossible to pass its testing suite. > > We need dists that can realistically pass. > > Leon Yup. It's a mess. chris From florent.angly at gmail.com Wed Feb 13 22:33:14 2013 From: florent.angly at gmail.com (Florent Angly) Date: Thu, 14 Feb 2013 08:33:14 +1000 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> Message-ID: <511C14AA.9030107@gmail.com> On 14/02/13 01:18, Fields, Christopher J wrote: > I*highly* recommend using X.Y versioning for simplicity (e.g. no more 3-point versions) Yes, I support the X.Y versioning as well. 
Florent From l.m.timmermans at students.uu.nl Wed Feb 13 23:12:06 2013 From: l.m.timmermans at students.uu.nl (Leon Timmermans) Date: Thu, 14 Feb 2013 00:12:06 +0100 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl Future Development In-Reply-To: <511C14AA.9030107@gmail.com> References: <118F034CF4C3EF48A96F86CE585B94BF6CE24CF5@CHIMBX5.ad.uillinois.edu> <511C14AA.9030107@gmail.com> Message-ID: On Wed, Feb 13, 2013 at 11:33 PM, Florent Angly wrote: > On 14/02/13 01:18, Fields, Christopher J wrote: >> >> I*highly* recommend using X.Y versioning for simplicity (e.g. no more >> 3-point versions) > > Yes, I support the X.Y versioning as well. > Florent See also: http://www.dagolden.com/index.php/369/version-numbers-should-be-boring/ Leon From daisieh at gmail.com Thu Feb 14 05:21:15 2013 From: daisieh at gmail.com (Daisie Huang) Date: Wed, 13 Feb 2013 21:21:15 -0800 (PST) Subject: [Bioperl-l] Question regarding while loops for reading files In-Reply-To: References: Message-ID: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com> I think you need to reset the pointer to the filehandle before you go through the while loop the second time: seek $fh,0,0 On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote: > > Hey Guys, > > I am still at the same place. I am writing these little pieces of code to > try to learn the language better, so any advice would be useful. I am again > parsing through tab delimited files and now trying to find fish from on id > (in these case families AS5 and AS9), retrieve the weights and average > them. When I started I did it for one family and it worked (instead of the > @families I had a scalar $family set to AS5). But really it is more useful > to look at more than one family at time (I should mention that are 2 types > of fish per family one ends in PS , the other doesn't). So I tried to use a > foreach loop to go through the file twice, once with a the search value set > to AS5 and a second time to AS9. 
It works for AS5, but for some reason, the > foreach loop sets $test to AS9 the second time, but it doesn't go through > the while loop. What am I doing wrong? > > here is the code: > > #! /usr/bin/perl > use strict; > use warnings; > > my $file = $ARGV[0]; > my @family = ('AS5','AS9'); > my $i; > my $ii; > my $test; > > open (my $fh, "<", $file) or die ("Can't open $file: $!"); > > foreach (@family){ > $test = $_; > my @data_weight_2N = (); > my @data_weight_3N = (); > while (<$fh>){ > chomp; > my $line = $_; > my @data = split ("\t", $line); > if ($data[0] !~ /[0-9]*/){ > next;} > elsif ($data[1] eq "ABF09-$test"){ > $i += 1; > push (@data_weight_2N, $data[6]); > }elsif ($data[1] eq "ABF09-".$test."PS"){ > $ii += 1; > push (@data_weight_3N,$data[6]); > } > } > my $mean_2N = &average (\@data_weight_2N); > my $stdev_2N = &stdev (\@data_weight_2N); > my $stderr_2N = ($stdev_2N/sqrt($i)); > > print "These are the the avearge weight, stdev and stderr for $test > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n"; > > my $mean_3N = &average (\@data_weight_3N); > my $stdev_3N = &stdev (\@data_weight_3N); > my $stderr_3N = ($stdev_3N/sqrt($i)); > > print "These are the the avearge weight, stdev and stderr for $test > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n"; > } > > close ($fh); > > sub average{ > my($data) = @_; > if (not @$data) { > print ("Empty array\n"); > return 0; > } > my $total = 0; > foreach (@$data) { > $total += $_; > } > my $average = $total / @$data; > return $average; > } > > sub stdev{ > my($data) = @_; > if(@$data == 1){ > return 0; > } > my $average = &average($data); > my $sqtotal = 0; > foreach(@$data) { > $sqtotal += ($average-$_) ** 2; > } > my $std = ($sqtotal / (@$data-1)) ** 0.5; > return $std; > } > > Thanks, > > T. > > -- > "Education is not to be used to promote obscurantism." - Theodonius > Dobzhansky. 
> > "Gracias a la vida que me ha dado tanto > Me ha dado el sonido y el abecedario > Con ?l, las palabras que pienso y declaro > Madre, amigo, hermano > Y luz alumbrando la ruta del alma del que estoy amando > > Gracias a la vida que me ha dado tanto > Me ha dado la marcha de mis pies cansados > Con ellos anduve ciudades y charcos > Playas y desiertos, monta?as y llanos > Y la casa tuya, tu calle y tu patio" > > Violeta Parra - Gracias a la Vida > > Tiago S. F. Hori. PhD. > Ocean Science Center-Memorial University of Newfoundland > From sebastien.moretti at unil.ch Thu Feb 14 08:09:06 2013 From: sebastien.moretti at unil.ch (=?ISO-8859-1?Q?S=E9bastien_MORETTI?=) Date: Thu, 14 Feb 2013 09:09:06 +0100 Subject: [Bioperl-l] PhyloXML In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> References: <51152591.9010402@unil.ch> <511898E6.7060400@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE24E8D@CHIMBX5.ad.uillinois.edu> <511BBD83.2000708@unil.ch> <118F034CF4C3EF48A96F86CE585B94BF6CE2532B@CHIMBX5.ad.uillinois.edu> Message-ID: <511C9BA2.9000508@unil.ch> >>>>>> # Add annotation >>>>>> $treeio->add_phyloXML_annotation(-obj => $tree, >>>>>> -xml => 'SUMF family', >>>>>> ); >>>>> >>>>> If you really have $treeio in your script in this line and not $tree, then that's at least one problem. But the stack suggests that the above isn't the exact line in your script - can you confirm that? >>>>> >>>>> -hilmar >>>> >>>> I replaced $treeio by $tree in the above line but still get an error. >>>> Don't see what you mean by "the stack suggests that the above isn't the exact line in your script" >>>> >>>> The only think I changed is the length of the xml string I try to insert. But get the same error with an empty xml string. 
>>>> >>>> >>>> >>>> my $treeio = new Bio::TreeIO(-file => "$infile", >>>> -format => 'phyloxml', >>>> ); >>>> my $tree = $treeio->next_tree; >>>> >>>> # Add annotation >>>> $tree->add_phyloXML_annotation(-obj => $tree, >>>> -xml => 'SUMF family', >>>> ); >>>> >>>> Can't locate object method "add_phyloXML_annotation" via package >>>> "Bio::Tree::Tree" at ./add_annotation_to_phyloxml.pl line 40, line 1 (#1) >>>> (F) You called a method correctly, and it correctly indicated a package >>>> functioning as a class, but that package doesn't define that particular >>>> method, nor does any of its base classes. See perlobj. >>>> >>>> Uncaught exception from user code: >>>> >>>> at ./add_annotation_to_phyloxml.pl line 40 >>> >>> Will have to look into this. One problem we have is that phyloXML support has dwindled, so if anyone wants to take this on I would be more than happy to help them get started. >>> >>> chris >> >> You mean that BioPerl 1.6.901 has not a full support of PhyloXML ? >> The problem I have is "expected" ? >> >> -- >> S?bastien Moretti > > I think it handles most of phyloXML fine, but the implementation of the parser is a little tricky. I tried cleaning this up a few years back but didn't make much progress. > > The function is in Bio::TreeIO::phyloxml, so the correct call should be (as you previously had it): > > $treeio->add_phyloXML_annotation(-obj => $tree, > -xml => 'SUMF family', > ); > > My guess is that Bio::Tree::Tree was AnnotatableI at one point but that was removed, will have to trace that back. Can you file a bug on this? > > https://redmine.open-bio.org/ > > chris I will fill a bug on this. I'd be happy to try to contribute to the phyloxml code. But don't know how to proceed for BioPerl. 
--
Sébastien Moretti

From hartzell at alerce.com Thu Feb 14 20:04:44 2013
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 14 Feb 2013 12:04:44 -0800
Subject: [Bioperl-l] Question regarding while loops for reading files
In-Reply-To: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
References: <3cbbba3b-759d-4281-9592-6b690aea92ab@googlegroups.com>
Message-ID: <20765.17244.185833.755900@gargle.gargle.HOWL>

I think that it's important to get feedback on code that one has written and to try to understand how/what/why someone else has done in their code. To that end....

Since Tiago's using this to learn the language better I can't resist some comments beyond resetting the file handle.

For grins I rewrote it using Text::CSV_XS and Statistics::Basic and to take a single pass through the data file using a multilevel data structure. I resisted the urge to rewrite it in Moose. Didn't even have an urge to rewrite it in R. Funny, that....

The script is here Tiago.pl

  https://gist.github.com/hartzell/4955401

With something like what I think the data looks like here:

  https://gist.github.com/hartzell/4955570

Even without that big of a rewrite, I had a bunch of local comments which are inline below.

Daisie Huang writes:
> [...]
> On Wednesday, February 13, 2013 6:46:41 PM UTC-8, Tiago Hori wrote:
> >
> > Hey Guys,
> >
> > I am still at the same place. I am writing these little pieces of code to
> > try to learn the language better, so any advice would be useful.
> > [...]
> > here is the code:
> >
> > #! /usr/bin/perl
> > use strict;
> > use warnings;
> >
> > my $file = $ARGV[0];

Slightly better would be $filename, so that when you step up to Path::Class you can differentiate a file object from a file name string.

> > my @family = ('AS5','AS9');

Better would be @families, plural. See the use of $family below.

> > my $i;
> > my $ii;

As far as I can tell, these are just counting the number of things that you push onto the various arrays.
You don't need them; referring to the list in scalar context will give you its size.

> > my $test;

You use this to hold the name of the family, so it's not particularly evocative. You should also restrict its scope to within the loop. See the comment for the foreach loop.

> > open (my $fh, "<", $file) or die ("Can't open $file: $!");

You made my day, three arg. open *and* you checked for errors. Nice!

> > foreach (@family){

Better would be

  for my $family (@families) {

which is evocative and restricts the scope of $family to the for loop (and for is 4 characters shorter than foreach...).

> > $test = $_;

No longer need this, using $family declared in the for loop with the proper scoping.

> > my @data_weight_2N = ();
> > my @data_weight_3N = ();
> > while (<$fh>){
> > chomp;
> > my $line = $_;
> > my @data = split ("\t", $line);

Don't parse CSV (TSV) files yourself. Get in the habit of using Text::CSV_XS.

> > if ($data[0] !~ /[0-9]*/){
> > next;}
> > elsif ($data[1] eq "ABF09-$test"){
> > $i += 1;

You don't need the counter.

> > push (@data_weight_2N, $data[6]);
> > }elsif ($data[1] eq "ABF09-".$test."PS"){
> > $ii += 1;

You don't need the counter.

> > push (@data_weight_3N,$data[6]);
> > }
> > }
> > my $mean_2N = &average (\@data_weight_2N);
> > my $stdev_2N = &stdev (\@data_weight_2N);

You don't need the ampersands on the subroutine calls. They're old school and just encourage people to make fun of our language for its use of all those funny punctuation marks.

> > my $stderr_2N = ($stdev_2N/sqrt($i));

Unless I'm mistaken, this is equivalent to

  my $stderr_2N = ($stdev_2N/sqrt(scalar @data_weight_2N));

and you don't need the counter; the explicit use of scalar there might even be redundant (I'm a coward). You use the same trick in your subroutine defn's below.
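
A minimal sketch of the loop with the comments so far applied: a lexically scoped $family, no counters, and array sizes read in scalar context. It also uses the seek rewind suggested earlier in the thread, so that the second family is actually read. The sample rows and the in-memory filehandle are assumptions made only so the snippet runs on its own; the column positions (label in column 1, weight in column 6) follow the quoted script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real tab-delimited file.
my $tsv = <<"END";
1\tABF09-AS5\t-\t-\t-\t-\t10
2\tABF09-AS5\t-\t-\t-\t-\t14
3\tABF09-AS5PS\t-\t-\t-\t-\t20
4\tABF09-AS9\t-\t-\t-\t-\t8
END

# In-memory filehandle (assumption); a real script would open $filename.
open my $fh, '<', \$tsv or die "Can't open in-memory file: $!";

my @families = ('AS5', 'AS9');
my %count;    # family => { '2N' => n, '3N' => n }

for my $family (@families) {
    # Rewind before every pass; without this the second pass starts at EOF.
    seek $fh, 0, 0 or die "Can't rewind: $!";

    my (@weights_2N, @weights_3N);
    while (my $line = <$fh>) {
        chomp $line;
        my @data = split /\t/, $line;
        if    ($data[1] eq "ABF09-$family")     { push @weights_2N, $data[6] }
        elsif ($data[1] eq "ABF09-${family}PS") { push @weights_3N, $data[6] }
    }

    # An array in scalar context yields its size, so no counters are needed.
    $count{$family} = {
        '2N' => scalar @weights_2N,
        '3N' => scalar @weights_3N,
    };
    printf "%s: %d 2N fish, %d 3N fish\n",
        $family, scalar @weights_2N, scalar @weights_3N;
}

close $fh or die "Can't close: $!";
```

Taking the size from each array also avoids two pitfalls in the quoted script: $i and $ii are declared outside the loop and never reset, so they keep growing across families, and the 3N standard error divides by sqrt($i) rather than sqrt($ii).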
> >
> > print "These are the the avearge weight, stdev and stderr for $test
> > 2N:\t", $mean_2N,"\t",$stdev_2N,"\t",$stderr_2N, "\n";
> >
> > my $mean_3N = &average (\@data_weight_3N);
> > my $stdev_3N = &stdev (\@data_weight_3N);
> > my $stderr_3N = ($stdev_3N/sqrt($i));
> >
> > print "These are the the avearge weight, stdev and stderr for $test
> > 3N:\t", $mean_3N,"\t",$stdev_3N,"\t",$stderr_3N, "\n";
> > }
> >
> > close ($fh);

Ah, rats. You checked whether open worked, you need to do the same thing on close too!

  close ($fh) or die $!;

Or you could just

  use autodie qw(open close);

and then they'll die appropriately when they have to and you don't have to bother with the checking.

> > sub average{
> > my($data) = @_;
> > if (not @$data) {
> > print ("Empty array\n");
> > return 0;
> > }
> > my $total = 0;
> > foreach (@$data) {
> > $total += $_;
> > }

  use List::AllUtils qw(sum); # somewhere up at the top of the script...

  my $total = sum(@$data);
  if (not defined $total) {
    print "Empty array\n";
    return;
  }

List::AllUtils is your friend. Learn to use it.

Your returning 0 for an empty list is probably the wrong thing; isn't it possible for the total to actually be 0? Just return instead. Don't return undef, just return (and let perl take context into account for you).

You probably don't actually want to spew "Empty array" out into your output stream; imagine writing a script that postprocesses your output and having to deal with it. If you really need to say it, send it to standard error with

  print STDERR "Empty array\n";

> > my $average = $total / @$data;
> > return $average;

If you don't really need the error message, then you can get to

  my $total = sum(@$data);
  return unless defined $total;
  return $total / @$data;

And if an empty data array is *truly* unexpected, maybe you should just die/carp.
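
A small sketch of the average routine along the lines discussed above, assuming the core List::Util module (List::AllUtils exports the same sum). The emptiness check is made on the array itself rather than on the total, because a total of zero is a legitimate result. The sample inputs are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(sum);    # core module; provides sum()

# Mean of the numbers in an array reference.
# A bare return yields undef in scalar context and an empty list in
# list context, so the caller can tell "no data" apart from a mean of 0.
sub average {
    my ($data) = @_;
    return unless @$data;
    return sum(@$data) / @$data;    # @$data in numeric context is its size
}

my $mean_zero = average([ -2, 0, 2 ]);    # genuine mean of zero
my $mean_none = average([]);              # no data at all

print defined $mean_zero ? "mean: $mean_zero\n" : "no data\n";
print defined $mean_none ? "mean: $mean_none\n" : "no data\n";
```

Callers that treat missing data as a hard error could wrap the call and die or carp when the result is undefined, as suggested above.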
> > }
> >
> > sub stdev{
> > my($data) = @_;
> > if(@$data == 1){
> > return 0;
> > }
> > my $average = &average($data);
> > my $sqtotal = 0;
> > foreach(@$data) {
> > $sqtotal += ($average-$_) ** 2;
> > }
> > my $std = ($sqtotal / (@$data-1)) ** 0.5;
> > return $std;
> > }

Ditto on the use of List::AllUtils, etc...

Phew.

The only other thing I'd like to see would be an arrangement that lets you write simple tests. A simple sol'n would be to package the entire main part of the code up into e.g. a subroutine that returns a hashref keyed by family, containing a hashref keyed by 2N/3N/... and then you could just:

  use Test::More;
  use Tiago qw(summarize);

  my $output = summarize("test_data.tsv");
  is($output->{AS5}->{'2N'}, "42", "Got the magic number");
  # etc...
  done_testing;

Thanks for sharing your code. Keep practicing!

g.

From carandraug+dev at gmail.com Thu Feb 14 22:13:45 2013
From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=)
Date: Thu, 14 Feb 2013 22:13:45 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
Message-ID:

Hi

we got word of it on another project I'm involved with and I was wondering. Is bioperl going to apply for the Google Summer of Code this year?

http://www.google-melange.com/gsoc/homepage/google/gsoc2013

Carnë

From hlapp at drycafe.net Fri Feb 15 14:28:30 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 15 Feb 2013 09:28:30 -0500
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To:
References:
Message-ID: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>

I presume the OBF does as an umbrella organization on behalf of all Bio* projects. If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors.

-hilmar

Sent with a tap.

On Feb 14, 2013, at 5:13 PM, Carnë Draug wrote:

> Hi
>
> we got word of it on another project I'm involved with and I was
> wondering. Is bioperl going to apply for the Google Summer of Code
> this year?
>
> http://www.google-melange.com/gsoc/homepage/google/gsoc2013
>
> Carnë
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From p.j.a.cock at googlemail.com Fri Feb 15 14:47:39 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 15 Feb 2013 14:47:39 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
Message-ID:

On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp wrote:
> I presume the OBF does as an umbrella organization on behalf of all Bio*
> projects. If you fancy proposing a project idea or mentoring, now is not a
> bad time to think about that or looking for co-mentors.
>
> -hilmar

Yes, the plan is that as in the last few years, the OBF will apply to
GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
this stage the Bio* projects would be wise to start coming up with
some good project ideas and experienced developers thinking about
being a mentor. For potential students, getting involved in the
community early is a good idea (e.g. bug reports, or better fixing
existing bugs)

See also:
http://lists.open-bio.org/mailman/listinfo/gsoc
http://lists.open-bio.org/mailman/listinfo/gsoc-mentors

Peter

From cjfields at illinois.edu Fri Feb 15 14:59:43 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 15 Feb 2013 14:59:43 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To:
References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>

On Feb 15, 2013, at 8:47 AM, Peter Cock wrote:

> On Fri, Feb 15, 2013 at 2:28 PM, Hilmar Lapp wrote:
>> I presume the OBF does as an umbrella organization on behalf of all Bio*
>> projects.
>> If you fancy proposing a project idea or mentoring, now is not a
>> bad time to think about that or looking for co-mentors.
>>
>> -hilmar
>
> Yes, the plan is that as in the last few years, the OBF will apply to
> GSoC and cover for BioPerl, BioJava, BioRuby, Biopython etc. At
> this stage the Bio* projects would be wise to start coming up with
> some good project ideas and experienced developers thinking about
> being a mentor. For potential students, getting involved in the
> community early is a good idea (e.g. bug reports, or better fixing
> existing bugs)
>
> See also:
> http://lists.open-bio.org/mailman/listinfo/gsoc
> http://lists.open-bio.org/mailman/listinfo/gsoc-mentors
>
> Peter

At the moment I'm not sure if Rob is heading this up or if the baton will
be passed on to someone else. I can't take charge of writing up a
proposal at the moment but I can certainly help edit.

chris

From scott at scottcain.net Fri Feb 15 19:18:37 2013
From: scott at scottcain.net (Scott Cain)
Date: Fri, 15 Feb 2013 14:18:37 -0500
Subject: [Bioperl-l] sequence-region directives in gff files
In-Reply-To:
References:
Message-ID:

Hi Carnë,

Thanks for pointing this out; I was only sort of paying attention to the
FeatureIO discussion, and it hadn't occurred to me that my commit was the
problem. I believe I've reproduced the functionality from that commit,
and I even added a test that makes use of the added method (yes, I know,
it surprised me too!). All of the tests now pass for me in the FeatureIO
master. I'm putting it on my todo list to check that the Chado loader
that makes use of Bio::FeatureIO still works as expected with the new
incarnation.

Thanks,
Scott

On Wed, Feb 13, 2013 at 5:22 AM, Carnë Draug wrote:
> Hi Scott
>
> 3 years ago, the code for the Bio::SeqFeatureIO::* modules was split
> from bioperl-live into a separate repository[1].
> Because the code was
> not removed from the bioperl-live repository, people ended up patching
> on both sides, leading to 2 branches of development. Last weekend I
> merged them back together with the exception of one commit that would
> not longer apply[2].
>
> This commit was authored by you with the following commit message:
> "tiny change to Bio::FeatureIO::gff to allow the gmod chado gff3 bulk
> loader to not choke when the gff file has ##sequence-region
> directives. The loader is documented not to support this, but now it
> will quitely ignore those directives."
>
> Do you think you could take a look at it?
>
> Thank you,
> Carnë
>
> [1] https://github.com/bioperl/Bio-FeatureIO
> [2] https://github.com/bioperl/bioperl-live/commit/7218728b66ad297953676236077fd0ec757378c0

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)         216-392-3087
Ontario Institute for Cancer Research

From carandraug+dev at gmail.com Tue Feb 19 18:52:57 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 19 Feb 2013 18:52:57 +0000
Subject: [Bioperl-l] bioperl in Google Summer of Code 2013
In-Reply-To:
References: <50A8D5F3-7248-49C1-AA1D-969C2BCA38A9@drycafe.net> <118F034CF4C3EF48A96F86CE585B94BF6CE28328@CHIMBX5.ad.uillinois.edu>
Message-ID:

On 15 February 2013 14:28, Hilmar Lapp wrote:
> [...]
> If you fancy proposing a project idea or mentoring, now is not a bad time to think about that or looking for co-mentors.

On 15 February 2013 14:59, Fields, Christopher J wrote:
> At the moment I'm not sure if Rob is heading this up or if the baton will be passed on to someone else. I can't take charge of writing up a proposal at the moment but I can certainly help edit.

I would like to participate this year as a student. I do not, however,
have any bioperl itch that would last a summer to fix. The largest of
them is to implement BLAST using NCBI's server.
They have made available a SOAP-based BLAST and doing this has been on my
todo for ages.

Would you suggest any other project for bioperl?

Carnë

From peymanalavi at yahoo.com Tue Feb 19 21:16:49 2013
From: peymanalavi at yahoo.com (peyman alavi)
Date: Tue, 19 Feb 2013 13:16:49 -0800 (PST)
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
Message-ID: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>

Hello,

I have been having problems for a while trying to install the Bio::SCF
module on my Vista32. Now, I know that Bio::SCF isn't really a Bioperl
module, but I need it for Bio::Graphics, and I thought perhaps other
people had experienced the same problem before. I have installed zlib
and io_lib (both in their latest available versions), but it looks like
something (presumably with io_lib) is missing. I should be very grateful
if someone could tell me what still needs to be done!

Here are the paths where the io_lib "library" and "include" directories
are installed, and I set them in cpan before trying to install Bio::SCF:

o conf makepl_arg "LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include"

And the following is what I get on STDOUT:

Set up gcc environment - 4.7.2

cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.

    makepl_arg         [LIBS=-Lc:/MinGW/msys/1.0/local/lib INC=-Ic:/MinGW/msys/1.0/local/include]
Please use 'o conf commit' to make the config permanent!

Reading 'D:\Perl\cpan\Metadata'
  Database was generated on Sun, 17 Feb 2013 12:17:02 GMT
Running install for module 'Bio::SCF'
Running make for L/LD/LDS/Bio-SCF-1.03.tar.gz
Checksum for D:\Perl\cpan\sources\authors\id\L\LD\LDS\Bio-SCF-1.03.tar.gz ok
Scanning cache D:\Perl/cpan/build for sizes
.............................................................................DONE
Bio-SCF-1.03/
Bio-SCF-1.03/t/
Bio-SCF-1.03/t/scf.t
Bio-SCF-1.03/eg/
Bio-SCF-1.03/eg/write_test_obj.pl
Bio-SCF-1.03/eg/write_test_tied.pl
Bio-SCF-1.03/eg/read_test_obj.pl
Bio-SCF-1.03/eg/read_test_tied.pl
Bio-SCF-1.03/SCF/
Bio-SCF-1.03/SCF/Arrays.pm
Bio-SCF-1.03/DISCLAIMER
Bio-SCF-1.03/README
Bio-SCF-1.03/SCF.pm
Bio-SCF-1.03/SCF.xs
Bio-SCF-1.03/Changes
Bio-SCF-1.03/test.scf
Bio-SCF-1.03/Makefile.PL
Bio-SCF-1.03/META.yml
Bio-SCF-1.03/INSTALL
Bio-SCF-1.03/MANIFEST

  CPAN.pm: Building L/LD/LDS/Bio-SCF-1.03.tar.gz

Set up gcc environment - 4.7.2
Checking if your kit is complete...
Looks good
Writing Makefile for Bio::SCF
Writing MYMETA.yml and MYMETA.json
cp SCF.pm blib\lib\Bio\SCF.pm
cp SCF/Arrays.pm blib\lib\Bio\SCF\Arrays.pm
D:\Perl\bin\perl.exe D:\Perl\site\lib\ExtUtils\xsubpp -typemap D:\Perl\lib\ExtUtils\typemap
SCF.xs > SCF.xsc && D:\Perl\bin\perl.exe -MExtUtils::Command -e mv -- SCF.xsc SCF.c
Please specify prototyping behavior for SCF.xs (see perlxs manual)
c:/MinGW/bin/gcc.exe -c -Ic:/MinGW/msys/1.0/local/include -DNDEBUG
-DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE
-DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T
-DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields -O2
-DVERSION=\"1.03\" -DXS_VERSION=\"1.03\" "-ID:\Perl\lib\CORE" -DLITTLE_ENDIAN SCF.c
In file included from c:/MinGW/msys/1.0/local/include/io_lib/scf.h:31:0,
                 from SCF.xs:12:
c:/MinGW/msys/1.0/local/include/io_lib/mFILE.h:23:0: warning: "MF_APPEND" redefined [enabled by default]
In file included from c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/windows.h:55:0,
                 from D:\Perl\lib\CORE/win32.h:61,
                 from D:\Perl\lib\CORE/win32thread.h:4,
                 from D:\Perl\lib\CORE/perl.h:2825,
                 from SCF.xs:5:
c:\mingw\bin\../lib/gcc/mingw32/4.7.2/../../../../include/winuser.h:131:0: note: this is the location of the previous definition
SCF.xs: In function 'XS_Bio__SCF_get_scf_pointer':
SCF.xs:35:2: warning: passing argument 3 of '(*Perl_ILIO_ptr((struct PerlInterpreter *)Perl_get_context()))->pNameStat' from incompatible pointer type [enabled by default]
SCF.xs:35:2: note: expected 'struct _stati64 *' but argument is of type 'struct stat *'
Running Mkbootstrap for Bio::SCF ()
D:\Perl\bin\perl.exe -MExtUtils::Command -e chmod -- 644 SCF.bs
D:\Perl\bin\perl.exe -MExtUtils::Mksymlists \
     -e "Mksymlists('NAME'=>\"Bio::SCF\", 'DLBASE' => 'SCF', 'DL_FUNCS' => {  }, 'FUNCLIST' => [], 'IMPORTS' => {  }, 'DL_VARS' => []);"
Set up gcc environment - 4.7.2
dlltool --def SCF.def --output-exp dll.exp
c:\MinGW\bin\g++.exe -o blib\arch\auto\Bio\SCF\SCF.dll -Wl,--base-file -Wl,dll.base -mdll -L"D:\Perl\lib\CORE" SCF.o
D:\Perl\lib\CORE\perl512.lib
c:\MinGW\lib\libkernel32.a c:\MinGW\lib\libuser32.a c:\MinGW\lib\libgdi32.a
c:\MinGW\lib\libwinspool.a c:\MinGW\lib\libcomdlg32.a c:\MinGW\lib\libadvapi32.a
c:\MinGW\lib\libshell32.a c:\MinGW\lib\libole32.a c:\MinGW\lib\liboleaut32.a
c:\MinGW\lib\libnetapi32.a c:\MinGW\lib\libuuid.a c:\MinGW\lib\libws2_32.a
c:\MinGW\lib\libmpr.a c:\MinGW\lib\libwinmm.a c:\MinGW\lib\libversion.a
c:\MinGW\lib\libodbc32.a c:\MinGW\lib\libodbccp32.a c:\MinGW\lib\libcomctl32.a
c:\MinGW\lib\libmsvcrt.a dll.exp
Warning: resolving _VirtualQuery@12 by linking to _VirtualQuery
Use --enable-stdcall-fixup to disable these warnings
Use --disable-stdcall-fixup to disable these fixups
Warning: resolving _VirtualProtect@16 by linking to _VirtualProtect
Warning: resolving _EnterCriticalSection@4 by linking to _EnterCriticalSection
Warning: resolving _TlsGetValue@4 by linking to _TlsGetValue
Warning: resolving _GetLastError@0 by linking to _GetLastError
Warning: resolving _LeaveCriticalSection@4 by linking to _LeaveCriticalSection
Warning: resolving _DeleteCriticalSection@4 by linking to _DeleteCriticalSection
Warning: resolving _InitializeCriticalSection@4 by linking to _InitializeCriticalSection
SCF.o:SCF.c:(.text+0xf35): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0xf4b): undefined reference to `mfwrite_scf'
SCF.o:SCF.c:(.text+0xf6a): undefined reference to `mfflush'
SCF.o:SCF.c:(.text+0xf72): undefined reference to `mfdestroy'
SCF.o:SCF.c:(.text+0x1138): undefined reference to `write_scf'
SCF.o:SCF.c:(.text+0x16ac): undefined reference to `scf_deallocate'
SCF.o:SCF.c:(.text+0x17b1): undefined reference to `mfreopen'
SCF.o:SCF.c:(.text+0x17c1): undefined reference to `mfread_scf'
SCF.o:SCF.c:(.text+0x19bd): undefined reference to `read_scf'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: SCF.o: bad reloc address 0xa4 in section `.rdata'
c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: final
link failed: Invalid operation
collect2.exe: error: ld returned 1 exit status
dmake.exe:  Error code 129, while making 'blib\arch\auto\Bio\SCF\SCF.dll'
  LDS/Bio-SCF-1.03.tar.gz
  D:\Perl\site\bin\dmake.exe -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 LDS/Bio-SCF-1.03.tar.gz                      : make NO

Warning: Configuration not saved.
Lockfile removed.

Thanks in advance for any useful suggestions/help!!

Peyman

From scott at scottcain.net Tue Feb 19 23:39:44 2013
From: scott at scottcain.net (Scott Cain)
Date: Tue, 19 Feb 2013 18:39:44 -0500
Subject: [Bioperl-l] BioGraphics: Bio::SCF installation through cpan fails
In-Reply-To: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
References: <1361308609.90384.YahooMailNeo@web120901.mail.ne1.yahoo.com>
Message-ID: <777246AB-2EF0-403D-9652-8EA8390D5C53@scottcain.net>

Hi Peyman,

I have no idea what might be required to get staden and Bio::SCF
installed on a windows machine; you have my sympathies for having to go
through it. But what I wanted to touch on was what you wrote, that is,
that you "need" it for Bio::Graphics. I just wanted to point out that
you don't need it unless you want to be able to display traces from ABI
sequencers (which most people don't really care to do these days).
Bio::SCF is listed as a recommended module, not a required one.

Scott

Sent from my iPad

From anngregory at email.arizona.edu Wed Feb 20 05:20:41 2013
From: anngregory at email.arizona.edu (Ann Gregory)
Date: Tue, 19 Feb 2013 22:20:41 -0700
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
Message-ID:

Hi BioPerl,

I am having issues with a BioPerl script. I have a blastxml file from a
blastx blast and the original multifasta file containing the original
nucleotide sequences.

I want to take the blast result (i.e. the blast description) and annotate
my multifasta file.

I have written 2 while loops that extract the blast descriptions as well
as the nucleotide sequence from the multifasta file.

My problem is that I cannot incorporate one of the while loops into the
other without losing the loop property of one of the loops. I would like
to take the 1st blast description, then the 1st nucleotide sequence, then
the 2nd blast description, then the 2nd nucleotide sequence and so
on... I just can't figure out how to alternate the results.

See script below:

use warnings;
use strict;
use Bio::SearchIO;
use Bio::SeqIO;

my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            my $qd = $hit->description;
            print $qd, "\n";
        }
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    my $nuc = $seqobj->seq();
    print $nuc, "\n";
}

--
Ann (Nina) Gregory
Graduate Student
Rich Lab / Sullivan Lab
Soil, Water, Environmental Science Department
University of Arizona

From yonexhalaolv at gmail.com Wed Feb 20 09:17:12 2013
From: yonexhalaolv at gmail.com (Sebastian Lau)
Date: Wed, 20 Feb 2013 01:17:12 -0800 (PST)
Subject: [Bioperl-l] failed to install via fink：no package found for specification 'bioperl-pm5100'!
Message-ID: <84fa1bcb-a39f-4847-bff2-e3a9c2b909ea@googlegroups.com>

Hi guys,

I was just about to install bioperl on my MacOS 10.7.5 via fink, but
after typing the command, fink said it couldn't find any package:

fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm5100
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm5100'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm588
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm588'!
fangmatoMacBook-Pro:~ yoyo$ fink install bioperl-pm586
Information about 6901 packages read in 1 seconds.
Failed: no package found for specification 'bioperl-pm586'!

I followed the instructions on the wiki. I don't know what's wrong with
it. Thanks for your help.

From awitney at sgul.ac.uk Wed Feb 20 15:22:51 2013
From: awitney at sgul.ac.uk (Adam Witney)
Date: Wed, 20 Feb 2013 15:22:51 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To:
References:
Message-ID: <5124EA4B.5020409@sgul.ac.uk>

Hi Ann,

On 20/02/2013 05:20, Ann Gregory wrote:
> Hi BioPerl,
>
> I am having issues with a BioPerl script. I have a blastxml file from a
> blastx blast and the original multifasta file containing the original
> nucleotides sequences.
>
> I want to take the blast result (ie. the blast description) and annotate my
> multifasta file.
>
> I have written 2 while loops that extract the blast descriptions as well as
> the nucleotide sequence from the multifasta file.
>
> My problem is that I cannot incorporate one of the while loops into the
> other without loosing the loop property of one of the loops. I would like
> to take the 1st blast description, then the 1st nucleotide sequence, then
> the 2nd blast description, then the 2nd nucleotide sequence and so
> on...just can figure out how to alternate the results.
>
> See script below:
>
>
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
>
>
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $qd = $hit->description;
> print $qd, "\n";
> }
> }
> }
>
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> print $nuc, "\n";
> }--

I think what you are proposing assumes that the loop over the BLAST
results will come back in the same order as the loop over the Fasta
file. This may be the case, but I'm not sure it's something I would rely
on.

Anyway, I would loop over the BLAST results, storing the relevant data
in an array or hash, and then loop over the fasta file to put the two
together, e.g.:

    my %blast_data;

    while ( my $result = $search_in->next_result ) {
        my $hit = $result->next_hit or next;    # keep the top hit only
        # key on the query name so it can be matched to the fasta id
        $blast_data{ $result->query_name } = $hit->description;
    }

    while ( my $seqobj = $seqio->next_seq ) {
        my $id = $seqobj->id;
        print $blast_data{$id}, "\n" if exists $blast_data{$id};
    }

Something along those lines... or have I misunderstood you? If so, can
you provide some more details, like what you want your output to look
like?

HTH

Adam

From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 16:24:50 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:24:50 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To:
References:
Message-ID: <5124F8D2.4020904@uni-wuerzburg.de>

Oops, I just realized I had one loop too many in there. Adam is correct.
Sorry.

The last part of the code I sent you should look like this:

    my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
    while (my $seqobj = $seqio->next_seq) {
        print ">$hits{$seqobj->display_id}\n";
        my $nuc = $seqobj->seq();
        print $nuc, "\n";
    }

Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr.
7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 06:20, Ann Gregory wrote:
> Hi BioPerl,
>
> I am having issues with a BioPerl script. I have a blastxml file from a
> blastx blast and the original multifasta file containing the original
> nucleotides sequences.
>
> I want to take the blast result (ie. the blast description) and annotate my
> multifasta file.
>
> I have written 2 while loops that extract the blast descriptions as well as
> the nucleotide sequence from the multifasta file.
>
> My problem is that I cannot incorporate one of the while loops into the
> other without loosing the loop property of one of the loops. I would like
> to take the 1st blast description, then the 1st nucleotide sequence, then
> the 2nd blast description, then the 2nd nucleotide sequence and so
> on...just can figure out how to alternate the results.
>
> See script below:
>
>
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
>
>
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $qd = $hit->description;
> print $qd, "\n";
> }
> }
> }
>
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> print $nuc, "\n";
> }--
> Ann (Nina) Gregory
> Graduate Student
> Rich Lab / Sullivan Lab
> Soil, Water, Environmental Science Department
> University of Arizona
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 16:14:29 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 17:14:29 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To:
References:
Message-ID: <5124F665.5050602@uni-wuerzburg.de>

Hi Ann,

I agree with Adam, but I was already writing my email when his came in.
Hope it helps:

I hope I understand correctly what you want to do. Just to clarify, you
queried a protein blast database with blastx and nucleotide queries. Now
you want to associate the protein description for the FIRST blast hit
with the corresponding nucleotide fasta sequence. Is that correct?

You could nest one while loop inside the other, or associate the blast
hits with the query names. But it's not feasible to take the first blast
hit and the first nucleotide fasta seq, then the 2nd of both etc., as
Adam already pointed out. With nested loops you would have to take the
first blast hit, then iterate through the nucleotide fasta until you
find its query, then take the 2nd blast hit and iterate through the
nucleotide fasta again, etc.

It's probably easiest to do this with a hash. Something along the lines
of (not tested, I just punched that into the e-mail):

    my %hits;

    my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
    while (my $result = $search_in->next_result) {
        my $hit = $result->next_hit or next;   # Only want the first blast hit
        # key on the query name so it matches display_id of the fasta entry
        $hits{$result->query_name} = $hit->description;
    }

    my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
    while (my $seqobj = $seqio->next_seq) {
        my $id = $seqobj->display_id;
        next unless exists $hits{$id};         # skip queries without a hit
        print ">$hits{$id}\n";
        my $nuc = $seqobj->seq();
        print $nuc, "\n";
    }

You might want to put some evalue cutoff in there to only keep
significant hits. Also, if your nucleotide query multi-fasta file is very
large, you might consider creating an index first:
http://www.bioperl.org/wiki/HOWTO:Local_Databases#Bio::Index

Hope that helps!
Cheers,
Andreas

P.S.: Please next time include version numbers for BioPerl and Perl and a little more detail on what you want to do. ;-)

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

From andreas.leimbach at uni-wuerzburg.de Wed Feb 20 17:00:51 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Wed, 20 Feb 2013 18:00:51 +0100
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To:
References: <5124F8D2.4020904@uni-wuerzburg.de>
Message-ID: <51250143.9050503@uni-wuerzburg.de>

Hey Ann,

damn, it's not my best day ... Anyways, I wouldn't work with List::MoreUtils's each_array function, as this assumes that the blast hits and the nucleotide queries are in the same order (as Adam pointed out). Rather use a hash, which associates a key with a certain value. Also, the hash can be used to skip sequences that have no hits.
Here's my new version:

my %hits;
my $hit_desc;
my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
while (my $result = $search_in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            # hash: associate query_desc (key) with hit_desc (value)
            $hits{$result->query_description} = $hit->description;
            last; # jump out of the while loop; this should resolve getting only the first hit
        }
        last; # see above
    }
}

my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
while (my $seqobj = $seqio->next_seq) {
    # only true if display_id is associated with a hit_desc, so seqs without hits are skipped
    if ($hits{$seqobj->display_id}) {
        print ">$hits{$seqobj->display_id}\n";
        my $nuc = $seqobj->seq();
        print $nuc, "\n";
    }
}

Cheers,
Andreas

P.S.: I redirected your mail to the BioPerl mailing list, others might profit from my mistakes ;-) ...

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

On 20.2.13 17:35, Ann Gregory wrote:
> Hi Andreas,
>
> Thanks for your help! I don't understand how this gets the first blast hit:
>
> if ($hit->description eq $hit_desc) { # Only want the first blast hit
>     next;
> }
>
> I tried this and it seems to be working... but I can't get the 1st blast hit
> or skip the sequences that had no hits. Do you know any quick fixes?
>
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> use List::MoreUtils qw(each_array);
>
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file => "$ARGV[0]");
> my @ids;
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $match = $result->num_hits;
> push(@ids, $qd);
> }
> }
> }
> }
>
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> my @seqs;
> while (my $seqobj = $seqio->next_seq) {
>     my $nuc = $seqobj->seq();
>     push(@seqs, $nuc);
> }
>
> my $it = each_array(@ids, @seqs);
> while (my ($ids, $seqs) = $it->()) {
>     print $ids, "\n", $seqs, "\n";
> }
>
> Thanks again!
> ~Ann
>
> On Wed, Feb 20, 2013 at 9:24 AM, Andreas Leimbach wrote:
> > oops, I just realized I had one loop too many in there. Adam is
> > correct. Sorry.
> >
> > The last part of the code I sent you should look like this:
> >
> > my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> > while (my $seqobj = $seqio->next_seq) {
> >     print ">$hits{$seqobj->display_id}\n";
> >     my $nuc = $seqobj->seq();
> >     print $nuc, "\n";
> > }
> >
> > Cheers,
> > Andreas
>
> --
> Ann (Nina) Gregory
> Graduate Student
> Rich Lab / Sullivan Lab
> Soil, Water, Environmental Science Department
> University of Arizona

From cjfields at illinois.edu Wed Feb 20 18:24:58 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 20 Feb 2013 18:24:58 +0000
Subject: [Bioperl-l] Problem Parsing BLAST output to annotate FASTA file
In-Reply-To: <51250143.9050503@uni-wuerzburg.de>
References: <5124F8D2.4020904@uni-wuerzburg.de> <51250143.9050503@uni-wuerzburg.de>
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6CE2EB4A@CHIMBX5.ad.uillinois.edu>

If this is meant to be something done using the same FASTA files for a bunch of BLAST reports, it might be worth setting up a flat file index and using that to look up and grab the sequences; it should be a LOT faster, and only the first pass (generation of the initial index) would take a little time. Look at Bio::DB::Fasta for an example.
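
The indexed lookup suggested here might look roughly like the sketch below. It is untested against a real BLAST report and rests on a few assumptions: the helper name annotated_records is made up for illustration, the keys of %hits are assumed to be the sequence IDs used in the FASTA file (the first word after ">"), and the single %hits entry is only a placeholder for what the SearchIO loop would collect. Bio::DB::Fasta->new and get_Seq_by_id are the documented entry points of that module.

```perl
use strict;
use warnings;

# Build one FASTA record per query that has a hit, using the hit description
# as the new header. $db is expected to behave like a Bio::DB::Fasta handle:
# get_Seq_by_id($id) returns a sequence object (or undef when the ID is not
# in the index) whose seq() method returns the sequence string.
# NOTE: annotated_records is a hypothetical helper, not part of BioPerl.
sub annotated_records {
    my ($db, $hits) = @_;
    my @records;
    for my $id (sort keys %$hits) {
        my $seqobj = $db->get_Seq_by_id($id);
        next unless defined $seqobj;    # query ID not found in the FASTA file
        push @records, ">$hits->{$id}\n" . $seqobj->seq . "\n";
    }
    return @records;
}

# Only runs when a FASTA path is given on the command line.
if (@ARGV) {
    require Bio::DB::Fasta;
    # The first call builds the index next to the FASTA file; later calls reuse it.
    my $db = Bio::DB::Fasta->new($ARGV[0]);
    # Placeholder entry; in practice %hits is filled from the blastxml report.
    my %hits = ('query_1' => 'hypothetical protein');
    print annotated_records($db, \%hits);
}
```

Compared with re-reading the multi-fasta through Bio::SeqIO for every report, this only touches the sequences that actually have hits, at the cost of building the index once.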
chris

From carandraug+dev at gmail.com Mon Feb 25 10:08:23 2013
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Mon, 25 Feb 2013 10:08:23 +0000
Subject: [Bioperl-l] module for description of sequence variants (where to place code)
Message-ID:

Hi

I'm writing a perl module to write a description of the variance between 2 sequences as described on http://www.hgvs.org/mutnomen/recs-prot.html

Basically, given 2 sequences, it would return something like "p.Lys2del p.His25_Met26insGln" if those are the differences. It also accounts for the existence of - characters in the sequences that may come from their alignment.

My question is, where in the project tree should I place the module?

Also, is there something already written that would convert from 1 to 3 letter code?

Carnë

From andreas.leimbach at uni-wuerzburg.de Mon Feb 25 10:32:43 2013
From: andreas.leimbach at uni-wuerzburg.de (Andreas Leimbach)
Date: Mon, 25 Feb 2013 11:32:43 +0100
Subject: [Bioperl-l] module for description of sequence variants (where to place code)
In-Reply-To:
References:
Message-ID: <512B3DCB.7050008@uni-wuerzburg.de>

Hi Carnë,

for your last question: You can convert aa strings from one to three letter code with 'Bio::SeqUtils'.

Cheers,
Andreas

--
Andreas Leimbach
Universität Münster
Institut für Hygiene
Mendelstr. 7
D-48149 Münster
Germany

Tel.: +49 (0)551 39 3843
E-Mail: andreas.leimbach at uni-wuerzburg.de

From genehack at genehack.org Thu Feb 28 00:57:48 2013
From: genehack at genehack.org (John SJ Anderson)
Date: Wed, 27 Feb 2013 16:57:48 -0800
Subject: [Bioperl-l] YAPC talks?
Message-ID:

Hi -

Is there anyone that was planning on submitting a Bioperl talk to YAPC::NA? In an unrelated conversation, one of the organizers expressed an interest in getting a Bioperl talk this year.

If no one else is planning on a talk submission, Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl rules / Bioperl sucks" overview/state of the dist style talk...

thanks,
john.

From cjfields at illinois.edu Thu Feb 28 02:48:55 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 28 Feb 2013 02:48:55 +0000
Subject: [Bioperl-l] YAPC talks?
In-Reply-To:
References:
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF6E705CD3@CHIMBX5.ad.uillinois.edu>

At the moment I personally have no plans on going, but I think a no-holds-barred bioperl talk is a good idea.

chris

From hlapp at drycafe.net Thu Feb 28 03:20:34 2013
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 27 Feb 2013 22:20:34 -0500
Subject: [Bioperl-l] YAPC talks?
In-Reply-To:
References:
Message-ID: <42C1F1B8-FE26-43A8-B601-E80D17D215EC@drycafe.net>

On Feb 27, 2013, at 7:57 PM, John SJ Anderson wrote:

> Jay Hannah (aka deafferret) and I are promising/threatening a tag-team style "Bioperl
> rules / Bioperl sucks" overview/state of the dist style talk...

Please videotape.
I'll be sure to watch and promote it :-)

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================

From saladi1 at illinois.edu Thu Feb 28 06:58:20 2013
From: saladi1 at illinois.edu (Shyam Saladi)
Date: Wed, 27 Feb 2013 22:58:20 -0800
Subject: [Bioperl-l] EUtilities Cookbook - Accn to gi
Message-ID:

Hi,

I think that rettype for the section "Get GIs for a list of accessions" should be

    -rettype => 'gi');

instead of 'gilist' as it is now. I think this change is due to a change in NCBI eutils.

webpage: http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Get_GIs_for_a_list_of_accessions

Thanks,
Shyam

From fossandonc at hotmail.com Thu Feb 28 15:36:34 2013
From: fossandonc at hotmail.com (Francisco J. Ossandón)
Date: Thu, 28 Feb 2013 12:36:34 -0300
Subject: [Bioperl-l] Fix for Bug #3376 broke somewhere else
Message-ID:

Hi,

I was re-checking Bug #3302 using the Bio::SearchIO modules of the repository and found that it now can't parse a Hmmer2 file that was previously fine. After tracking down the problem, I discovered that a change in a regular expression made to fix another bug broke the parsing.

The fix for Bug #3376 consisted of adding an extra condition to omit lines where the end-of-domain indicator is split across lines (https://redmine.open-bio.org/issues/3376):

TEST: domain 1 of 1, from 8 to 97: score 184.7, E = 2.5e-56
*->svfqqqqssksttgstvtAiAiAigYRYRYRAvtWnsGsLssGvnDn
sv+qqqq+ + +vtAiAiAigYRYRYRAv Wn GsLs G nDn
Test 8 SVYQQQQGGSA----MVTAIAIAIGYRYRYRAVVWNKGSLSTGTNDN 50

DnDqqsdgLYtiYYsvtvpssslpsqtviHHHaHkasstkiiikiePr<-
DnDq +d LYtiYYsvtv +ss+p q+v+HHHaH+asstkiiiki P
Test 51 DNDQAAD-LYTIYYSVTVSASSWPGQSVTHHHAHPASSTKIIIKIAPS 97

*
Test - -

This case is characterized by the 2 dashes in the last line. So this is the expression that was added in hmmer2.pm, in "next_result" (https://github.com/bioperl/bioperl-live/commit/142e5d79e3a6593db32bf0af99048f47d01bd3f2):

elsif (CORE::length($_) == 0
    || ( $count != 1 && /^\s+$/o )
    || /^\s+\-?\*\s*$/
    || /^.+\-\s+\-\s*$/ ) ### <--- This regex was designed for bug 3376
{
    next;
}

But the expression used is too broad, because it allows "^.+" directly before the 2 dashes, and it broke the parsing of lines like these, which are full of dashes:

KyACrqCdtiVQAPaPakpIErGiptaGLLArvlVSKyaEHlPLYRQsEI

lcl|gi|340 - -------------------------------------------------- -

yaRqGVeiaRstLadWVgrtgarLaPLvdALaeyVLkeGklHADeTPVqV
+i s L V++ + r
lcl|gi|340 60938 ------AIMISGLIHGVSARCLRF-------------------------- 60955

I think a reasonable fix that still resolves the original bug and restores parsing for this case is to add an extra \s+ in the regex just before the first dash. That way the expression makes sure that the first dash is the one that comes AFTER the description (where it replaces the usual coordinate number) and is not the last dash of an alignment or of a series of dashes like the one above:

elsif (CORE::length($_) == 0
    || ( $count != 1 && /^\s+$/o )
    || /^\s+\-?\*\s*$/
    || /^.+\s+\-\s+\-\s*$/ ) ### <--- Tweaked regex
{
    next;
}

I tested it and it works fine; I hope you find the fix acceptable.

Cheers,

--
Francisco J. Ossandon
Bioinformatician. Ph.D. Candidate, University Andres Bello.
Center for Bioinformatics and Genome Biology, Fundacion Ciencia para la Vida.
Santiago, Chile.
www.cienciavida.cl/CBGB.htm

From PDagosto at edgebio.com Mon Feb 25 16:50:34 2013
From: PDagosto at edgebio.com (Phil Dagosto)
Date: Mon, 25 Feb 2013 16:50:34 +0000
Subject: [Bioperl-l] Error when running Build.PL
Message-ID:

Greetings,

I downloaded BioPerl 1.6.1 from this location:

http://www.bioperl.org/wiki/Getting_BioPerl

When I ran Build.PL with all of the default settings chosen in the interactive mode I got the following error message:

Could not get valid metadata. Error is: Invalid metadata structure.
Errors: 'Perl_5' for 'license' does not have a URL scheme (resources -> license) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::gff -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::WebAgent -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::EUtilParameters -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::OntologyIO::InterProParser -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Biblio::IO::medlinexml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::strider -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::RandomFactory -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA::ESEfinder -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::game::gameSubs -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::FeatureIO::interpro -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::berkeleydb -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::entrezgene -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tinyseq -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::chadoxml -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::SeqIO::game::gameWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::FileCache -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::bsml_sax -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Primer3 -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::GFF::Adaptor::ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::HtSNP -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tree::Compatible -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Ace -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Taxonomy::entrez -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::agave -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PopGen::TagHaplotype -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::SeqFeature::Store::FeatureFileLoader -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::Protein* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::blastxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::EUtilities -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::Tree::Draw::Cladogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::tigrxml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Collection -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Draw::Pictogram -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SearchIO::Writer::BSMLResultWriter -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::HIVQuery -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::TreeIO::svggraph -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::eutils -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::SeqPattern::BackTranslate -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Query::GenBank -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Variation::IO::xml -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::PhyloNetwork::GraphViz -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqFeature::Annotated -> requires) [Validation: 1.2], Expected a map structure from string or file. 
(optional_features -> Bio::DB::NCBIHelper -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::HIV -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Analysis::DNA* -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Tools::Run::RemoteBlast -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::SeqIO::excel -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::ClusterIO::dbsnp -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::Microarray::Tools::ReseqChip -> requires) [Validation: 1.2], Expected a map structure from string or file. (optional_features -> Bio::DB::Biblio::soap -> requires) [Validation: 1.2] at /usr/local/lib/perl5/5.10.1/Module/Build/Base.pm line 4559
Could not create MYMETA files
Creating new 'Build' script for 'BioPerl' version '1.006001'

I have no idea whether this is a problem or not, or whether I can proceed. Also, I'm confused by the version number referenced in the last line. 1.006001 is our current version - I thought I was installing version 1.6.1. Are these version numbers equivalent, i.e., are the zeros not meaningful? I was actually looking for version 1.2.3 (or greater) - where can I find that?

Thanks,
Phil

Phil Dagosto
Sr. Software Engineer
Edge Bio
201 Perry Parkway, Suite 5
Gaithersburg, MD 20850
pdagosto at edgebio.com
(240) 912-8669
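
On the version question: Perl reads a decimal version such as 1.006001 in groups of three digits after the decimal point, so it corresponds to the dotted form 1.6.1. A minimal sketch of how one could check this with the core version module follows; the helper name dotted is made up for illustration, and the script says nothing about the metadata warnings themselves.

```perl
use strict;
use warnings;
use version;    # core module

# Return the dotted form ("vX.Y.Z") of a version string.
# NOTE: dotted is a hypothetical helper name used only in this sketch.
sub dotted { return version->parse($_[0])->normal }

my $reported = '1.006001';    # as printed by Build.PL
print "$reported is ", dotted($reported), "\n";    # prints "1.006001 is v1.6.1"

# Version objects overload the comparison operators.
print "same release as 1.6.1\n"
    if version->parse($reported) == version->parse('v1.6.1');
print "at least 1.2.3\n"
    if version->parse($reported) >= version->parse('v1.2.3');
```

Read this way, a release reported as 1.006001 already satisfies a requirement of 1.2.3 or greater.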