From cjfields at illinois.edu Mon Aug 1 00:07:38 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 31 Jul 2011 23:07:38 -0500
Subject: [Bioperl-l] BioPerl Test requirements
Message-ID: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu>

All,

We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with Debian packaging.

As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :).

Anyway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick.

I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, i.e. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Differences (Bio::Root::Test has a few additional methods of use as well).

As this will require developers to use Test::Most instead, though, I thought it would be worth asking on the list to see if there are any objections. Any thoughts?

chris

From cjfields at illinois.edu Mon Aug 1 00:42:39 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 31 Jul 2011 23:42:39 -0500
Subject: [Bioperl-l] protaparam
In-Reply-To:
References:
Message-ID: <44853A9D-9E78-469E-B8D8-B06EBDB5F780@illinois.edu>

Shachi,

My guess is this is not a BioPerl-specific issue, but that the web service interface has changed or is no longer active. Unfortunately this is one module that has no tests associated with it, so this slipped through the cracks. You are more than welcome to file a bug on this, but if the service is inactive we'll likely immediately deprecate the module.

chris

On Jul 28, 2011, at 11:46 PM, Shachi Gahoi wrote:

> Dear All,
>
> If anybody knows how to run protparam using bioperl please let me know.
>
> Thanks in advance
>
> --
> Regards,
> Shachi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From jason.stajich at gmail.com Mon Aug 1 03:12:32 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Sun, 31 Jul 2011 23:12:32 -0800
Subject: [Bioperl-l] Fwd: Bio::Tools::Run::Phylo::Phyml, tree_string
References:
Message-ID: <3521B67E-D158-492A-8A60-025D6C5C9934@gmail.com>

Heikki - can you take a look at this when you get time - I'm unclear what the BIONJ string is used for?

Begin forwarded message:

> From: Tristan Lefebure
> Date: July 27, 2011 6:12:16 AM AKDT
> To: bioperl mailing list
> Subject: Re: [Bioperl-l] Bio::Tools::Run::Phylo::Phyml, tree_string
>
> done:
> https://redmine.open-bio.org/issues/3273
>
> --
> Tristan
>
> On Tue, Jul 26, 2011 at 9:43 PM, Chris Fields wrote:
>> That's an odd one. Could you file this on redmine?
>>
>> chris
>>
>> On Jul 26, 2011, at 10:14 AM, Tristan Lefebure wrote:
>>
>>> Oops, I found a typo in my post, it should read:
>>>
>>> I am not quite sure I understand why tree_string() from
>>> Bio::Tools::Run::Phylo::Phyml returns
>>> a string that looks like that (I removed the end of the tree):
>>>
>>> BIONJ(((((((('92':0.0114354726,'12':0.0472591023)0.0000000000:0.0000005859,...
>>>
>>> On Tue, Jul 26, 2011 at 4:47 PM, Tristan Lefebure
>>> wrote:
>>>> Hi there,
>>>> I am not quite sure I understand why tree_string() from Bio::Tools::Run::Phylo::Phyml returns
>>>> a string that looks like that (I removed the end of the tree):
>>>>
>>>> Tree is BIONJ(((((((('92':0.0114354726,'12':0.0472591023)0.0000000000:0.0000005859,...
>>>>
>>>> Why do we have this 'Tree is BIONJ' thing?
>>>>
>>>> A quick look at the code in the _run() function gives:
>>>>
>>>> {
>>>>     open(my $FH_TREE, "<", $tree_file)
>>>>         || $self->throw("Phyml call ($command) did not give an output: $?");
>>>>     local $/;
>>>>     $self->{_tree} .= <$FH_TREE>;
>>>> }
>>>>
>>>> Why append something to $self->{_tree}? What about:
>>>> $self->{_tree} = <$FH_TREE>;
>>>>
>>>> I was about to file a bug report, but then I saw that in Phyml.t:
>>>>
>>>> is substr($factory->tree_string, 0, 9), 'BIONJ(SIN', 'tree_string()';
>>>>
>>>> Well, I am lost. Any help much appreciated...
>>>>
>>>> --
>>>> Tristan

From David.Messina at sbc.su.se Mon Aug 1 05:09:47 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 1 Aug 2011 11:09:47 +0200
Subject: [Bioperl-l] BioPerl Test requirements
In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu>
References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu>
Message-ID:

Sounds good, Chris. Go for it.

Dave

From hlapp at drycafe.net Mon Aug 1 16:30:18 2011
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Mon, 1 Aug 2011 16:30:18 -0400
Subject: [Bioperl-l] BioPerl Test requirements
In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu>
References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu>
Message-ID:

I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it.

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================

From cjfields at illinois.edu Mon Aug 1 16:34:56 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 1 Aug 2011 15:34:56 -0500
Subject: [Bioperl-l] BioPerl Test requirements
In-Reply-To:
References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu>
Message-ID: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu>

Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave!

chris

From hlapp at drycafe.net Mon Aug 1 18:36:27 2011
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Mon, 1 Aug 2011 18:36:27 -0400
Subject: [Bioperl-l] Job opportunity: User Interface Design and Web Application Developer
Message-ID: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net>

(Apologies if you have received this already or if this is considered spam - we're trying to reach out as broadly as possible and I know that quite a few in the Bio* communities would be well qualified. Please feel free to pass on to anyone who might be interested, or might know someone who is.)

User Interface Design and Web Application Developer

The National Evolutionary Synthesis Center (NESCent) seeks a creative and enthusiastic individual to design user interfaces and web applications for scientific applications that manage, analyze, visualize and share data in support of evolutionary research. The incumbent will work as part of a small informatics team in close collaboration with domain scientists.

NESCent (http://nescent.org) is an NSF-funded center dedicated to cross-disciplinary research in evolutionary science. Our informatics team works closely with visiting and resident scientists to support their custom software and database development needs (http://informatics.nescent.org), and collaborates broadly with other biodiversity informatics projects.
All NESCent software products are open-source, and the Center has a number of initiatives to actively promote collaborative development of community software resources. Above all, we are enthusiastic about our work, about the mission of the Center, and about the contribution of informatics to that mission.

Job description: The incumbent will design and develop user interfaces and web applications for databases and other software tools for sponsored scientists and staff. The job responsibilities include all stages of the software development process, including requirements gathering, design, implementation, release packaging and documentation, as part of a small team (typically 2-3 individuals). We expect the incumbent to present their work at conferences and contribute to publications with scientific collaborators; interact regularly with visiting and resident scientists, other members of the informatics team and Center staff; and generally serve as an expert resource for Center personnel. The position provides opportunities for professional development and encourages research into new technologies. Most informatics staff work at our Durham NC offices, located adjacent to Duke University, but we support a wide range of technologies for virtual communication with off-site staff and collaborators.

Salary range: $70,000 - $80,000, depending on education and experience

Required Qualifications:
* Demonstrated success collaborating with clients on custom software solutions
* Experience with various stages of the software development cycle
* Expertise in development and testing of user interface designs
* Excellent communication skills, both virtual and face-to-face

Preferred Qualifications:
* M.S. or Ph.D. in Computer Science, Bioinformatics or related field
* Demonstrated interest in science, particularly biology
* Expertise in dynamic and interactive web technologies (JavaScript, CGI)
* Expertise in rapid application development and respective programming technologies and languages (e.g., modern scripting languages and web-application frameworks such as Python/Django, Ruby/Ruby-on-Rails, and Perl/Catalyst)
* Expertise in graphic design
* Expertise in data visualization and/or scientific data integration
* Expertise in software usability design and assessment
* Expertise in web service (SOAP, REST, XML, JSON) and semantic web technologies
* Fluency in Java programming
* Prior experience in relational database programming (PostgreSQL or MySQL)
* Experience with open-source, and collaborative, software development

How to apply: Please send cover letter, resume and contact information for three references to Dr. Karen Cranston, Training Coordinator and Bioinformatics Project Manager (karen.cranston at nescent.org). Please also complete the online application at the University of North Carolina HR website: http://bit.ly/r9HQ8r. Informal inquiries or requests for additional information may be directed to Dr. Cranston by email or phone (+1-919-613-2275). Closing date is August 15, 2011.

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================

From florent.angly at gmail.com Mon Aug 1 20:09:51 2011
From: florent.angly at gmail.com (Florent Angly)
Date: Tue, 02 Aug 2011 10:09:51 +1000
Subject: [Bioperl-l] BioPerl Test requirements
In-Reply-To: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu>
References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu>
Message-ID: <4E37404F.1040001@gmail.com>

If Test::Most gives more testing capabilities and makes packaging Bioperl easier, I think it's pretty sweet!

Florent

From hartzell at alerce.com Mon Aug 1 20:06:54 2011
From: hartzell at alerce.com (George Hartzell)
Date: Mon, 1 Aug 2011 17:06:54 -0700
Subject: [Bioperl-l] BioPerl Test requirements
In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu>
References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu>
Message-ID: <20023.16286.89015.854814@gargle.gargle.HOWL>

Sounds great.

g.

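
[Editorial sketch, not part of the thread] The proposal in the "BioPerl Test requirements" thread above is to declare Test::Most under 'build_requires' in Build.PL. A minimal Build.PL along those lines might look roughly like the following. It assumes Module::Build is the build tool; the distribution name, version, abstract and the perl version floor are placeholders invented for illustration, not values taken from any BioPerl repository.

```perl
#!/usr/bin/env perl
# Build.PL sketch: declare Test::Most as a build-phase-only dependency.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    dist_name     => 'BioPerl-Example',        # hypothetical distribution name
    dist_version  => '0.01',                   # hypothetical version
    dist_abstract => 'Example distribution',   # hypothetical abstract
    license       => 'perl',

    # Needed at run time by the installed code (assumed floor, for illustration).
    requires => {
        'perl' => '5.6.1',
    },

    # Needed only to build and run the test suite, not by the installed modules.
    build_requires => {
        'Module::Build' => 0,
        'Test::Most'    => 0,
    },
);

# Writes the ./Build script that drives "./Build", "./Build test", etc.
$build->create_build_script;
```

With a declaration like this, test files could load Test::Most directly instead of relying on bundled copies of Test::Warn and Test::Exception under t/lib. The extra helper methods that the thread says Bio::Root::Test provides are outside the scope of this sketch.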
From carandraug+dev at gmail.com Tue Aug 2 10:00:32 2011
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 2 Aug 2011 15:00:32 +0100
Subject: [Bioperl-l] wiki administrator needed
Message-ID:

Hi!

I have a problem with the bioperl wiki and have sent a support request to 'support at open-bio.org' as instructed here (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I got the ticket ID #966. This was 2 weeks ago. Can someone with administrator rights on the wiki do something about it?

Thanks in advance,
Carnë Draug

From p.j.a.cock at googlemail.com Tue Aug 2 10:56:30 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 2 Aug 2011 15:56:30 +0100
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References:
Message-ID:

2011/8/2 Carnë Draug:
> Hi!
>
> I have a problem with the bioperl wiki and have sent a support request
> to 'support at open-bio.org' as instructed here
> (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I
> got the ticket ID #966. This was 2 weeks ago. Can someone with
> administrator rights on the wiki do something about it?
>
> Thanks in advance,
> Carnë Draug

What was the problem with the wiki (for the benefit of those of us who might be able to fix it but are not on the support system and didn't get your email)?

Peter

From carandraug+dev at gmail.com Tue Aug 2 11:06:10 2011
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 2 Aug 2011 16:06:10 +0100
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References:
Message-ID:

2011/8/2 Peter Cock:
> 2011/8/2 Carnë Draug:
>> I have a problem with the bioperl wiki and have sent a support request
>> to 'support at open-bio.org' as instructed here
>> (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I
>> got the ticket ID #966. This was 2 weeks ago. Can someone with
>> administrator rights on the wiki do something about it?
>
> What was the problem with the wiki (for the benefit of those
> of us who might be able to fix it but are not on the support
> system and didn't get your email)?

Guess there should be no problem mentioning this on this open mailing list. Here's the e-mail I sent back then:

When logging in with OpenID, I accidentally created a new account. Now I can't use that OpenID for my real account since it's connected to that other account. It also doesn't let me remove that OpenID from that account.

My real account has the nickname 'Carandraug'.

The account I created by accident has the nickname '~carandraug' (because I was trying to connect my account with the OpenID of https://launchpad.net/~carandraug).

Could someone please remove the '~carandraug' account? I couldn't find a button to do so.

From hlapp at drycafe.net Tue Aug 2 12:25:48 2011
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 2 Aug 2011 12:25:48 -0400
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References:
Message-ID: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>

I don't think the wiki allows removing of accounts (only blocking). Someone would have to go into the MySQL database and do that.

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================

From p.j.a.cock at googlemail.com Tue Aug 2 12:27:11 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 2 Aug 2011 17:27:11 +0100
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>
Message-ID:

2011/8/2 Hilmar Lapp:
> I don't think the wiki allows removing of accounts (only blocking). Someone
> would have to go into the MySQL database and do that.

The MediaWiki FAQ says don't do that, but does mention an optional add-on for merging wiki user accounts.

We could block the unwanted account instead.

Peter

From cjfields at illinois.edu Tue Aug 2 12:35:36 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 2 Aug 2011 11:35:36 -0500
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>
Message-ID:

I don't know if blocking that account will solve the OpenID problem (that it is associated with the bad account), but maybe merging that account and Carnë's good one will work. Maybe it's worth looking at the add-on.

chris

From cjfields at illinois.edu Tue Aug 2 12:38:01 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 2 Aug 2011 11:38:01 -0500
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>
Message-ID:

Carnë,

Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps.

chris

From carandraug+dev at gmail.com Tue Aug 2 12:58:41 2011
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 2 Aug 2011 17:58:41 +0100
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>
Message-ID:

On 2 August 2011 17:38, Chris Fields wrote:
> Try logging in with the bad account, then go under 'my preferences'.
> There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps.

I had tried that the first time. However, it didn't let me do it because that OpenID was the one used to create the account.

Carnë

From ihok at hotmail.com Tue Aug 2 13:29:43 2011
From: ihok at hotmail.com (Jack Tanner)
Date: Tue, 2 Aug 2011 13:29:43 -0400
Subject: [Bioperl-l] fastq quality with initial @
Message-ID:

i've got a fastq file with PHRED quality strings that sometimes start with '@'. this breaks the _index_file routine in Bio/Index/Fastq.pm.

i would've filed this in bugzilla, but i'm not authorized to do that.

From cjfields at illinois.edu Tue Aug 2 14:59:00 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 2 Aug 2011 13:59:00 -0500
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>
Message-ID:

Let's see if we can get the merge account add-in working, then.

chris

From cjfields at illinois.edu Tue Aug 2 15:00:47 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 2 Aug 2011 14:00:47 -0500
Subject: [Bioperl-l] fastq quality with initial @
In-Reply-To:
References:
Message-ID: <441DB637-5586-488F-8943-FEA4D56C276B@illinois.edu>

On Aug 2, 2011, at 12:29 PM, Jack Tanner wrote:

> i've got a fastq file with PHRED quality strings that sometimes start with '@'.
> this breaks the _index_file routine in Bio/Index/Fastq.pm.
> i would've filed this in bugzilla, but i'm not authorized to do that.

We no longer use bugzilla (as of v 1.6.900); see here:

http://www.bioperl.org/wiki/Bugs

Just register for an account and submit. I would check the latest code before doing so, just in case it has been fixed.

chris

From bosborne11 at verizon.net Tue Aug 2 16:24:54 2011
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 02 Aug 2011 16:24:54 -0400
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>
Message-ID:

Chris,

This is the one I've used:

http://www.mediawiki.org/wiki/Extension:User_Merge_and_Delete

BIO

From cjfields at illinois.edu Tue Aug 2 18:01:42 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 2 Aug 2011 17:01:42 -0500
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>
Message-ID: <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu>

Carnë,

I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug), and deleted the old account. See if that works.

chris

From carandraug+dev at gmail.com Tue Aug 2 18:19:38 2011
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 2 Aug 2011 23:19:38 +0100
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To: <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu>
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu>
Message-ID:

On 2 August 2011 23:01, Chris Fields wrote:
> Carnë,
>
> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug), and deleted the old account. See if that works.
>
> chris

When I try to add this OpenID to my account, I still get the error: "That is someone else's OpenID."

If I try to log in with this OpenID, after saying that I'm logged in successfully, the site still looks as if I'm not logged in, with a button to 'log in' and an IP address instead of a username.

Another problem that I have when logging in is that sometimes mediawiki sends 'https://login.launchpad.net/ id/y7xtYzD' instead of 'https://login.launchpad.net/~carandraug' to the launchpad server. I don't know what's causing this. Trying to backspace and delete what may be invisible characters before and after the string sometimes solves this. This happens even though I type this character by character so if there's any invisible stuff on the form it must be there before. This occurs when using Iceweasel 3.5 (in Debian), Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX).

Carnë Draug

From cjfields at illinois.edu Tue Aug 2 18:39:19 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 2 Aug 2011 17:39:19 -0500
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu>
Message-ID:

On Aug 2, 2011, at 5:19 PM, Carnë Draug wrote:

> On 2 August 2011 23:01, Chris Fields wrote:
>> Carnë,
>>
>> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug), and deleted the old account. See if that works.
>>
>> chris
>
> When I try to add this OpenID to my account, I still get the error:
> "That is someone else's OpenID."

Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. Kinda wondered if that would happen...

> If I try to log in with this OpenID, after saying that I'm logged in
> successfully, the site still looks as if I'm not logged in, with a
> button to 'log in' and an IP address instead of a username.
>
> Another problem that I have when logging in is that sometimes mediawiki
> sends 'https://login.launchpad.net/ id/y7xtYzD' instead of
> 'https://login.launchpad.net/~carandraug' to the launchpad server. I
> don't know what's causing this. Trying to backspace and delete what
> may be invisible characters before and after the string sometimes
> solves this. This happens even though I type this character by
> character so if there's any invisible stuff on the form it must be
> there before. This occurs when using Iceweasel 3.5 (in Debian),
> Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX).
>
> Carnë Draug

Not sure myself, sounds like a MW bug. See if the OpenID works first, then maybe we can address that.

chris

From carandraug+dev at gmail.com Tue Aug 2 18:56:49 2011
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 2 Aug 2011 23:56:49 +0100
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu>
Message-ID:

2011/8/2 Chris Fields:
> On Aug 2, 2011, at 5:19 PM, Carnë Draug wrote:
>> On 2 August 2011 23:01, Chris Fields wrote:
>>> Carnë,
>>>
>>> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug), and deleted the old account. See if that works.
>>>
>>
>> When I try to add this OpenID to my account, I still get the error:
>> "That is someone else's OpenID."
>
> Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables.

This is solved. I connected my account with this OpenID and can now log in with it. Thank you.

>> Another problem that I have when logging is that sometimes mediawiki
>> sends 'https://login.launchpad.net/ id/y7xtYzD' instead of
>> 'https://login.launchpad.net/~carandraug' to the launchpad server. I
>> don't know what's causing this. Trying to backspace and delete what
>> may be invisible characters before and after the string sometimes
>> solves this. This happens even though I type this character by
>> character so if there's any invisble stuff on the form it must be
>> there before. This occurs when using Iceweasel 3.5 (in Debian),
>> Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX).

This still happens sometimes. It just happened now. I had also filed a support request about this issue some weeks ago (ticket #965). No idea what's been causing this.

Carnë
From cjfields at illinois.edu Tue Aug 2 21:55:23 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 20:55:23 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On Aug 2, 2011, at 5:56 PM, Carn? Draug wrote: > 2011/8/2 Chris Fields : >> On Aug 2, 2011, at 5:19 PM, Carn? Draug wrote: >>> On 2 August 2011 23:01, Chris Fields wrote: >>>> Carn?, >>>> >>>> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. See if that works. >>>> >>> >>> When I try to add this OpenID to my account, I still get the error: >>> "That is someone else's OpenID." >> >> Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. > > This is solved. I connected my account with this OpenID and can now > log in with it. Thank you No problem. Apparently there is a bug fix in the more recent versions of OpenID and UserMerge, I'll add a redmine task to make sure they get updated (have my hands full right now, and OpenID can sometimes be tricky to debug). >>> Another problem that I have when logging is that sometimes mediawiki >>> sends 'https://login.launchpad.net/ id/y7xtYzD' instead of >>> 'https://login.launchpad.net/~carandraug' to the launchpad server. I >>> don't know what's causing this. Trying to backspace and delete what >>> may be invisible characters before and after the string sometimes >>> solves this. This happens even though I type this character by >>> character so if there's any invisble stuff on the form it must be >>> there before. This occurs when using Iceweasel 3.5 (in Debian), >>> Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). > > This still happens sometimes. It just happened now. 
> I had also filed a
> support request about this issue some weeks ago (ticket #965). No idea
> what's been causing this.
>
> Carnë

Okay, as long as it's noted somewhere.

chris

From kai.blin at biotech.uni-tuebingen.de Wed Aug 3 04:55:04 2011
From: kai.blin at biotech.uni-tuebingen.de (Kai Blin)
Date: Wed, 03 Aug 2011 10:55:04 +0200
Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior
Message-ID: <4E390CE8.2050100@biotech.uni-tuebingen.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi folks,

as I mentioned on https://redmine.open-bio.org/issues/3264 there is something odd going on with Bio::Root::IO's _readline/_pushback functions. This seems to be intentional; at least there is a test case asserting the behaviour I'm seeing. It is, however, very confusing to the unsuspecting programmer using the code.

One assumption I'd immediately make would be that if I have code that does

$foo = $io->_readline; $io->_pushback($foo); $bar = $io->_readline;

then $foo will be the same string as $bar, regardless of what other pieces of the code did. Currently, this is not the case, because the readbuffer that _pushback pushes back into has new strings appended to the end, but _readline removes them from the front.

This easily violates the "principle of least surprise", so I think we should change the readbuffer to a stack. As far as I can tell, changing the _pushback function to "unshift" instead of "push" to the readbuffer breaks only the Root/RootIO.t test designed to test the old behaviour. I don't see any other tests failing on my system that don't fail without this patch.

Any comments from the core devs?

Cheers,
Kai

- --
Dipl.-Inform.
Kai Blin                           kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-Universität Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax :   ++49 7071 29-5979
Germany
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOOQzoAAoJEKM5lwBiwTTPO6QIAMDN1bAm1FFD98F0rhN7TCpW
sV2sLkQDESK9YjCxp3kAqCpg7ZCArcA5l7HmEdAZFTzdFnsfnvKJmNB86C30QXJs
6XcYSbvBIPQdhjK7WIhG2pANItiTxKTGgXDZklVjgj2dVT4kSkCgdGYAAMssT1hn
n1/jkBJu5uuCq43Wv5Ia+wEhdN0M+xgKc9x7MF/ikO2qr6x24odMNTW8VgyLsYie
p9M68U23aStip2rxV1hrhZzbnjLz66V6O9fIEHmm5CYLfcGXkcrclzLIeptepSj1
bj/7dWIdXy8VnoSNx4RbckHSkMbdIkmyPKzmoYFN7p3FvmrSXsOmB6nfD0hEkbY=
=S5ff
-----END PGP SIGNATURE-----

From shelly.mh at gmail.com Tue Aug 2 06:19:33 2011
From: shelly.mh at gmail.com (Shelly M)
Date: Tue, 2 Aug 2011 13:19:33 +0300
Subject: [Bioperl-l] question regarding Bio::DB::CUTG
Message-ID:

Hello,

My name is Shelly and I'm a student at the Hebrew University of Jerusalem. I'm trying to use the package Bio::DB::CUTG but I have some trouble retrieving the right table for a given organism. For example, if I write

my $cdtable = Bio::DB::CUTG->new(-sp =>'Mus musculus');

I get a warning message, "MSG: too many species - not a unique species id", and it returns _species => mitochondrion Mus musculus. So my question is: what is the exact format for retrieving the specific organism?

Thanks a lot for the help,
Shelly

From maximilien1er at gmail.com Tue Aug 2 22:50:44 2011
From: maximilien1er at gmail.com (Maxime Déraspe)
Date: Tue, 2 Aug 2011 19:50:44 -0700 (PDT)
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
Message-ID:

Hi,

when I parse a genbank file no matter what I do, the /translation="MKAV.."
tag value of a CDS never appears in the last place, as it should. Other tags like /note and /product come after /translation, which is not the usual practice in GenBank files. Does anyone have an idea how to deal with this, i.e. how to put the /translation tag value in the last place when I write the GenBank file?

Thank you!

Max

From shachigahoimbi at gmail.com Wed Aug 3 02:00:44 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Wed, 3 Aug 2011 11:30:44 +0530
Subject: [Bioperl-l] How to show branch length value in tree
Message-ID:

Dear All

I am using Bio::Tree modules for constructing and drawing a tree. *I am unable to show branch length values in the tree.*
Please tell me how I can do this, if anybody knows.

Here is the script which I am using, and I have also attached the generated tree.

Thanks in advance

################################################################################################

use Bio::AlignIO;
use Bio::Align::ProteinStatistics;
use Bio::Tree::DistanceFactory;
use Bio::TreeIO;
use Bio::Tree::Draw::Cladogram;

# for a dna alignment
# can also use ProteinStatistics

my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw');

my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');

my $stats = Bio::Align::ProteinStatistics->new;

my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd');

while( my $aln = $alnio->next_aln )
{
    my $mat = $stats->distance(-method => 'Kimura', -align => $aln);

    my $tree = $dfactory->make_tree($mat);
    $treeout->write_tree($tree);
}

my $dir = shift || '.';

opendir(DIR, $dir) || die $!;
for my $file ( readdir(DIR) )
{
    next unless $file =~ /(\S+)\.dnd$/;
    my $stem = $1;
    my $treeio = Bio::TreeIO->new('-format' => 'newick',
                                  '-file' => "$dir/$file");

    if( my $t1 = $treeio->next_tree )
    {
        my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
                                                   -tree => $t1,
                                                   -compact => 0);
        $obj1->print(-file => "$dir/$stem.eps");
    }
}
######################################################################################################## -- Regards, Shachi -------------- next part -------------- A non-text attachment was scrubbed... Name: ADP1.dnd Type: application/octet-stream Size: 1369 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ADP1.eps Type: application/postscript Size: 17718 bytes Desc: not available URL: From cjfields at illinois.edu Wed Aug 3 09:10:20 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 08:10:20 -0500 Subject: [Bioperl-l] Question to Bio::SearchIO::infernal.pm In-Reply-To: <4E32E14B020000EE00004F57@gwia1.boku.ac.at> References: <4E32E14B020000EE00004F57@gwia1.boku.ac.at> Message-ID: Nadine, Hard to guess w/o seeing the report, but I'm not terribly surprised. I believe I only coded for simple 1 CM reports, IIRC. You'll have to file this as a bug on redmine along with an example. chris On Jul 29, 2011, at 9:35 AM, Nadine Elpida Tatto wrote: > Hi There! > > > > I was wondering if you would or can help me. > > > I have an infernal report containing about 2000 CMs from an infernal run against Rfam.cm. To parse this report I wanted to use Bio::SearchIO::infernal.pm. Unfortunately this turned out to be a problem for me, because "$parser->next_result" only delivers the result for the first CM in the report and nothing more. > > > My code: > #!/usr/bin/perl -w > > > use strict;use Data::Dumper; > use Bio::SearchIO; > > > my $infile = $ARGV[0]; # infernal report > my $parser = Bio::SearchIO->new(-format => 'Infernal', > -file => $infile); > > > while( my $result = $parser->next_result ) { > print $result->query_name . "\n"; > } > > > exit; > > > > > The output: > > > ntatto:~$ ./infernalParser.pl infernal.output > 5S_rRNA > ntatto:~$ > > > > > I would expect the following (like parsing a blast report): > > > ntatto:~$ ./infernalParser.pl infernal.output > 5S_rRNA > 5_8S_rRNA > U1 > ... 
> ntatto:~$ > > > > I would be glad for help. > > > Thank you in advance. > > > Best Regards > > > N Tatto > > > > > > > > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From p.j.a.cock at googlemail.com Wed Aug 3 09:46:06 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 14:46:06 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: Message-ID: 2011/8/3 Maxime D?raspe : > Hi, > > when I parse a genbank file no matter what I do, the / > translation="MKAV.." tag value of a CDS never appear in the last place > as it should be. Other tags like /note= /product comes after / > translation which it's not the usual practice with genbank file. Could > anyone have an idea how to deal with it... put /translation tag value > in the last place when I write the genbank file. > > Thank you ! > > Max Hi Max, I'm not aware of anything in the feature table specification about the order of the feature qualifiers (the "tags" like /note and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/ I suspect BioPerl is using a hash (Biopython uses a dictionary) for the feature qualifiers, which would discard the order. Why do you care about the order? Peter From roy.chaudhuri at gmail.com Wed Aug 3 09:58:22 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 03 Aug 2011 14:58:22 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: References: Message-ID: <4E3953FE.5080304@gmail.com> Hi Shachi, I don't think you can draw labels on branches using Bio::Tree::Draw::Cladogram. 
However, it will draw node labels, so you could copy the branch lengths over to the node ids:

my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
                                           -tree => $t1,
                                           -compact => 0);
for my $node ($tree->get_nodes) {
    $node->id($node->branch_length) if defined $node->branch_length;
}
$obj1->print(-file => "$dir/$stem.eps")

Incidentally, in your script you write the tree out to a file, then read it back in using TreeIO. This is unnecessary, you can use $tree directly as input to Bio::Tree::Draw::Cladogram.

Alternatively, you could write out a newick file and use non-Bioperl software such as njplot or MEGA to draw your tree with labelled branch lengths.

Cheers,
Roy.

On 03/08/2011 07:00, Shachi Gahoi wrote:
> Dear All
>
> I am using Bio::Tree modules for constructing and drawing tree. *I am unable
> to show branch length value in tree.
> *
> Please tell me How can I do this, if anybody knows.
>
> Here is my script which i am using...and i also attached generated tree.
>
> Thanks in advance
>
> ################################################################################################
>
> use Bio::AlignIO;
> use Bio::Align::ProteinStatistics;
> use Bio::Tree::DistanceFactory;
> use Bio::TreeIO;
> use Bio::Tree::Draw::Cladogram;
>
> # for a dna alignment
> # can also use ProteinStatistics
>
> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw');
>
> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');
>
> my $stats = Bio::Align::ProteinStatistics->new;
>
> my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd');
>
> while( my $aln = $alnio->next_aln )
> {
> my $mat = $stats->distance(-method => 'Kimura', -align => $aln);
>
> my $tree = $dfactory->make_tree($mat);
> $treeout->write_tree($tree);
> }
>
> my $dir = shift || '.';
>
> opendir(DIR, $dir) || die $!;
> for my $file ( readdir(DIR) )
> {
> next unless $file =~ /(\S+)\.dnd$/;
> my $stem = $1;
> my $treeio = Bio::TreeIO->new('-format' => 'newick',
> '-file' =>
"$dir/$file"); > > if( my $t1 = $treeio->next_tree ) > { > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > $obj1->print(-file => "$dir/$stem.eps"); > } > } > > ######################################################################################################## > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From roy.chaudhuri at gmail.com Wed Aug 3 10:01:18 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 03 Aug 2011 15:01:18 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: <4E3953FE.5080304@gmail.com> References: <4E3953FE.5080304@gmail.com> Message-ID: <4E3954AE.2080401@gmail.com> Sorry, the code had a typo, it should be: my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, -tree => $t1, -compact => 0); for my $node ($t1->get_nodes) { $node->id($node->branch_length) if defined $node->branch_length; } $obj1->print(-file => "$dir/$stem.eps") On 03/08/2011 14:58, Roy Chaudhuri wrote: > Hi Shachi, > > I don't think you can draw labels on branches using > Bio::Tree::Draw::Cladogram. However, it will draw node labels, so you > could copy the branch lengths over to the node ids: > > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > for my $node ($tree->get_nodes) { > $node->id($node->branch_length) if defined $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > Incidentally, in your script you write the tree out to a file, then read > it back in using TreeIO. This is unnecessary, you can use $tree directly > as input to Bio::Tree::Draw::Cladogram. > > Alternatively, you could write out a newick file and use non-Bioperl > software such as njplot or MEGA to draw your tree with labelled branch > lengths. > > Cheers, > Roy. 
> > On 03/08/2011 07:00, Shachi Gahoi wrote: >> Dear All >> >> I am using Bio::Tree modules for constructing and drawing tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also attached generated tree. >> >> Thanks in advance >> >> ################################################################################################ >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw'); >> >> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics->new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', -align => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> ######################################################################################################## >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Wed Aug 3 10:08:33 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 
2011 09:08:33 -0500
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To:
References:
Message-ID: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu>

On Aug 3, 2011, at 8:46 AM, Peter Cock wrote:

> 2011/8/3 Maxime Déraspe:
>> Hi,
>>
>> when I parse a genbank file no matter what I do, the /
>> translation="MKAV.." tag value of a CDS never appear in the last place
>> as it should be. Other tags like /note= /product comes after /
>> translation which it's not the usual practice with genbank file. Could
>> anyone have an idea how to deal with it... put /translation tag value
>> in the last place when I write the genbank file.
>>
>> Thank you !
>>
>> Max
>
> Hi Max,
>
> I'm not aware of anything in the feature table specification
> about the order of the feature qualifiers (the "tags" like /note
> and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/
>
> I suspect BioPerl is using a hash (Biopython uses a dictionary)
> for the feature qualifiers, which would discard the order.
>
> Why do you care about the order?
>
> Peter

Yes, it uses a hash based on the feature tags. Not sure how Biopython handles it, but my guess is something similar (Peter?).

The output order was never a chief concern of ours. To tell the truth, our main focus has never been simple conversion, except to transform data into a format that is more manageable/normalized.

For those interested in making this change, all the code for printing features is in one method in Bio::SeqIO::genbank, _print_GenBank_FTHelper(). The best way to handle this would be to allow an optional coderef/callback that takes the feature (or the tags) and allows custom sorting and printing; I don't want to get into messy semantics on how to specifically sort tags, so it is best to let the user decide.
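To make the callback idea concrete, here is a minimal sketch in plain Perl. It is not code from Bio::SeqIO::genbank: `order_tags` and `$translation_last` are hypothetical names, and the qualifiers are modelled as an ordinary hash of tag name to list of values, which is only an assumption about how a caller might hand them over.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the tag names of a
# qualifier hash in the order chosen by an optional user-supplied comparator,
# a coderef taking two tag names. Without one, fall back to alphabetical order.
sub order_tags {
    my ($qualifiers, $cmp) = @_;
    $cmp ||= sub { $_[0] cmp $_[1] };
    return sort { $cmp->($a, $b) } keys %$qualifiers;
}

# Example comparator: alphabetical, but force 'translation' to the end.
my $translation_last = sub {
    my ($x, $y) = @_;
    return ($x eq 'translation') <=> ($y eq 'translation') || $x cmp $y;
};

# Made-up qualifier data for illustration only.
my %cds = (
    translation => ['MKAV'],
    product     => ['hypothetical protein'],
    note        => ['example only'],
);

print join(', ', order_tags(\%cds, $translation_last)), "\n";
# prints: note, product, translation
```

The point of the design is that the writer never needs to know what a "correct" order is; whoever cares about the order supplies the comparator.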
chris From cjfields at illinois.edu Wed Aug 3 10:16:37 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 09:16:37 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4E390CE8.2050100@biotech.uni-tuebingen.de> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi folks, > > as I mentioned on https://redmine.open-bio.org/issues/3264 there is > something odd going on with Bio::Root::IO's _readline/_pushback > functions. This seems to be intentional, at least there is a test case > asserting the behaviour I'm seeing. It his however very confusing to the > unexpecting programmer using the code. > > One assumption I'd immediately make would be that if I have code that > does a $foo = $io->_readline; $io->_pushback($foo); $bar = > $io->_readline;, $foo will be the same string as $bar, regardless what > other pieces of the code did. Currently, this is not the case, because > the readbuffer that _pushback pushes back into has new strings appended > to the end but readline removes them from the front. I think this test is performed in the regressions already, but if not then it is more than welcome. > This easily violates the "principle of least surprise", so I think we > should change the readbuffer to a stack. As far as I can tell, changing > the _pushback function to "unshift" instead of "push" to the readbuffer > breaks only the Root/RootIO.t test designed to test the old behaviour. I > don't see any other tests failing on my system that don't fail without > this patch. > > Any comments from the core devs? I don't have a problem with that beyond the change to the RootIO.t tests (it implies a specific behavior that some developers expect, so is a very subtle API change). However, this is how one would expect it, to be more like an 'unread' stack instead of a queue. 
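The queue-versus-stack difference discussed here can be seen with a toy buffer in plain Perl. This is only a sketch under stated assumptions: `round_trip`, `$as_queue` and `$as_stack` are made-up names, the buffer is a bare array, and none of this is the actual Bio::Root::IO implementation.

```perl
use strict;
use warnings;

# Toy model of a read buffer that already holds two pushed-back lines.
# Reads take from the front; the coderef decides where a line is put back.
sub round_trip {
    my ($pushback) = @_;
    my @readbuffer = ("A\n", "B\n");     # lines other code already pushed back

    my $foo = shift @readbuffer;         # stands in for _readline
    $pushback->(\@readbuffer, $foo);     # stands in for _pushback
    my $bar = shift @readbuffer;         # stands in for _readline again

    return ($foo, $bar);
}

my $as_queue = sub { push    @{ $_[0] }, $_[1] };   # append at the end
my $as_stack = sub { unshift @{ $_[0] }, $_[1] };   # put back at the front

for my $case (['push', $as_queue], ['unshift', $as_stack]) {
    my ($name, $code) = @$case;
    my ($foo, $bar)   = round_trip($code);
    printf "%-7s same line read back: %s\n", $name, $foo eq $bar ? 'yes' : 'no';
}
```

With "push" the second read returns the other buffered line, so the round trip fails; with "unshift" the line that was pushed back is the next one read.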
In fact, there is a module I used for Biome's pushback/readline called IO::Unread that implements an IO layer for mimicking this behavior; it might be worth looking into.

> Cheers,
> Kai

chris

Christopher Fields
Senior Research Scientist
National Center for Supercomputing Applications
Institute for Genomic Biology
University of Illinois Urbana-Champaign
1206 W. Gregory Dr., MC-195
Urbana, IL 61801

From p.j.a.cock at googlemail.com Wed Aug 3 10:45:21 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 3 Aug 2011 15:45:21 +0100
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu>
References: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu>
Message-ID:

On Wed, Aug 3, 2011 at 3:08 PM, Chris Fields wrote:
>
> Yes, it uses a hash based on the feature tags. Not sure how Biopython
> handles it but my guess is something similar (Peter?).

Yes, we key on the feature qualifier (e.g. note or product) and the values are a list of qualifier values (e.g. you can have two notes).

> The output order was never a chief concern of ours. To tell the truth
> our main focus has never been simple conversion, except to transform
> data into a format that is more manageable/normalized.
>
> For those interested in making this change, all the code for printing
> features is in one method in Bio::SeqIO::genbank, _print_GenBank_FTHelper().
> The best way to handle this would be to allow an optional coderef/callback
> that takes the feature (or the tags) and allows custom sorting and printing;
> I don't want to get into messy semantics on how to specifically sort tags,
> best to let the user decide.

For Biopython, switching from the default dictionary (hash type) to an order-preserving dictionary would be one option. I too have no wish to try and implement qualifier sorting without an explicit standard.
Peter

From maximilien1er at gmail.com Wed Aug 3 10:48:05 2011
From: maximilien1er at gmail.com (Maxime Déraspe)
Date: Wed, 3 Aug 2011 07:48:05 -0700 (PDT)
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To:
References:
Message-ID: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com>

> Hi Max,
>
> I'm not aware of anything in the feature table specification
> about the order of the feature qualifiers (the "tags" like /note
> and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/
>
> I suspect BioPerl is using a hash (Biopython uses a dictionary)
> for the feature qualifiers, which would discard the order.
>
> Why do you care about the order?
>
> Peter
>

Hi Peter,

I care about the order for the submission to NCBI. But I guess they will reformat the file before getting it into their database. It's also visually better when the translation of the protein comes at the end of the annotation for the CDS and not before /product, /note ....

Anyway, maybe I'll reformat the file as a Sequin table for a direct submission to NCBI with Sequin.

Thank you.

Max

From p.j.a.cock at googlemail.com Wed Aug 3 12:00:01 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 3 Aug 2011 17:00:01 +0100
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com>
Message-ID:

2011/8/3 Maxime Déraspe:
>>
>> Why do you care about the order?
>>
>
> Hi Peter,
>
> I care about the order for the submission to ncbi.

Do the NCBI have some guidelines which ask for a particular order?

> But I guess they
> will reformat the file before getting it in their database.

They seem to generate the official GenBank files from their database - so I doubt the input order matters.
> It's also > visually better when the translation of the protein comes in the end > of the annotation for the CDS and not before /product, /note .... I do see your point, but if that were the only motivation I wouldn't want to make generating GenBank output any more complicated than it already is. > Anyway maybe I'll reformat the file in sequin table for a direct > submission to ncbi with sequin. > > Thank you. > > Max Peter From cjfields at illinois.edu Wed Aug 3 12:52:02 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 11:52:02 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: > 2011/8/3 Maxime D?raspe : >>> >>> Why do you care about the order? >>> >> >> Hi Peter, >> >> I care about the order for the submission to ncbi. > > Do the NCBI have some guidelines which ask for a particular order? No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. >> But I guess they >> will reformat the file before getting it in their database. > > They seem to generate the official GenBank files from their > database - so I doubt the input order matters. Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). >> It's also >> visually better when the translation of the protein comes in the end >> of the annotation for the CDS and not before /product, /note .... > > I do see your point, but if that were the only motivation I wouldn't > want to make generating GenBank output any more complicated > than it already is. ... >> Anyway maybe I'll reformat the file in sequin table for a direct >> submission to ncbi with sequin. 
>> >> Thank you. >> >> Max > > Peter Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): https://redmine.open-bio.org/ chris From cjfields at illinois.edu Wed Aug 3 13:10:31 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 12:10:31 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Message-ID: <51452A39-42B7-4BBF-9F50-A37419E75454@illinois.edu> IMHO I find genbank too unwieldy, but it's nice to know the output works for NCBI submission. chris On Aug 3, 2011, at 12:06 PM, Brian Osborne wrote: > Peter, > > I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. 
In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. > > Brian O > > On Aug 3, 2011, at 12:52 PM, Chris Fields wrote: > >> On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: >> >>> 2011/8/3 Maxime D?raspe : >>>>> >>>>> Why do you care about the order? >>>>> >>>> >>>> Hi Peter, >>>> >>>> I care about the order for the submission to ncbi. >>> >>> Do the NCBI have some guidelines which ask for a particular order? >> >> No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. >> >>>> But I guess they >>>> will reformat the file before getting it in their database. >>> >>> They seem to generate the official GenBank files from their >>> database - so I doubt the input order matters. >> >> Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). >> >>>> It's also >>>> visually better when the translation of the protein comes in the end >>>> of the annotation for the CDS and not before /product, /note .... >>> >>> I do see your point, but if that were the only motivation I wouldn't >>> want to make generating GenBank output any more complicated >>> than it already is. >> ... >>>> Anyway maybe I'll reformat the file in sequin table for a direct >>>> submission to ncbi with sequin. >>>> >>>> Thank you. >>>> >>>> Max >>> >>> Peter >> >> >> Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. 
Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. >> >> We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. >> >> Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): >> >> https://redmine.open-bio.org/ >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bosborne11 at verizon.net Wed Aug 3 13:06:05 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 03 Aug 2011 13:06:05 -0400 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Peter, I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. Brian O On Aug 3, 2011, at 12:52 PM, Chris Fields wrote: > On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: > >> 2011/8/3 Maxime D?raspe : >>>> >>>> Why do you care about the order? >>>> >>> >>> Hi Peter, >>> >>> I care about the order for the submission to ncbi. >> >> Do the NCBI have some guidelines which ask for a particular order? 
> > No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. > >>> But I guess they >>> will reformat the file before getting it in their database. >> >> They seem to generate the official GenBank files from their >> database - so I doubt the input order matters. > > Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). > >>> It's also >>> visually better when the translation of the protein comes in the end >>> of the annotation for the CDS and not before /product, /note .... >> >> I do see your point, but if that were the only motivation I wouldn't >> want to make generating GenBank output any more complicated >> than it already is. > ... >>> Anyway maybe I'll reformat the file in sequin table for a direct >>> submission to ncbi with sequin. >>> >>> Thank you. >>> >>> Max >> >> Peter > > > Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. > > We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. 
> > Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): > > https://redmine.open-bio.org/ > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From lskatz at gmail.com Wed Aug 3 17:01:24 2011 From: lskatz at gmail.com (Lee Katz) Date: Wed, 3 Aug 2011 17:01:24 -0400 Subject: [Bioperl-l] SeqIO: paired end reads Message-ID: Hi all! I was wondering how to construct paired end reads from scratch. I know the locations of certain sequences across the genome with a high degree of confidence and so I want to give them to my assembler as paired end reads, along with my other sequence runs (454 and Illumina runs). I plan to use Newbler. My only problem is that I do not know the correct format in order to specify distance and sequences for a paired end reads run, and so I hope that there is a SeqIO solution. At the least, I hope that one bioperl member can point me to where the definition of the paired end reads file format is...? Thank you! --Lee From jason.stajich at gmail.com Wed Aug 3 17:17:01 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Wed, 3 Aug 2011 13:17:01 -0800 Subject: [Bioperl-l] SeqIO: paired end reads In-Reply-To: References: Message-ID: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> it depends on the assembler - For Illumina usually the paired ends end with /1 /2 and they have the same ID but are in two different files. Depends on if you are using interleaved paired reads or in two separate files. 
Some just expect the paired reads to be mated by virtue of being in the same order in two files. The ABySS and Velvet manuals both explain what is expected, so you will want to check what Newbler assumes about how the paired ends are encoded.

There are also simulator tools, if that is what you are ultimately trying to do: check out wgsim, which comes with samtools, or try dnaa.

On Aug 3, 2011, at 1:01 PM, Lee Katz wrote:

> Hi all! I was wondering how to construct paired end reads from scratch. I
> know the locations of certain sequences across the genome with a high degree
> of confidence and so I want to give them to my assembler as paired end
> reads, along with my other sequence runs (454 and Illumina runs). I plan to
> use Newbler.
>
> My only problem is that I do not know the correct format in order to specify
> distance and sequences for a paired end reads run, and so I hope that there
> is a SeqIO solution. At the least, I hope that one bioperl member can point
> me to where the definition of the paired end reads file format is...?
>
> Thank you!
>
> --Lee
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From roy.chaudhuri at gmail.com Thu Aug 4 07:22:23 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Thu, 04 Aug 2011 12:22:23 +0100
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To:
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com>
Message-ID: <4E3A80EF.2010409@gmail.com>

Hi Shachi,

Please keep replies on the mailing list so that others can follow the discussion.

As I mentioned, it is not possible to draw njplot-style trees with labelled branches using Bio::Tree::Draw::Cladogram; it currently only labels nodes (you could perhaps add branch labels as a feature request on Redmine).
The code I gave overwrites the existing "leaf" node ids (the accessions) with branch lengths. If you also want to keep the existing labels, you could try something like:

for my $node ($t1->get_nodes) {
    if ($node->is_Leaf) {
        $node->id($node->branch_length.' '.$node->id);
    } else {
        $node->id($node->branch_length);
    }
}

Cheers,
Roy.

On 04/08/2011 05:36, Shachi Gahoi wrote:
> Thank you so much. Now the branch length is showing in the tree.
>
> But I want the accession number in place of the node id.
>
> I attached a snapshot of the tree as I want it. Please tell me how I can do this.
>
> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri wrote:
>
> Sorry, the code had a typo, it should be:
>
> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                            -tree      => $t1,
>                                            -compact   => 0);
> for my $node ($t1->get_nodes) {
>     $node->id($node->branch_length) if defined $node->branch_length;
> }
> $obj1->print(-file => "$dir/$stem.eps")
>
> On 03/08/2011 14:58, Roy Chaudhuri wrote:
>
> Hi Shachi,
>
> I don't think you can draw labels on branches using
> Bio::Tree::Draw::Cladogram. However, it will draw node labels, so you
> could copy the branch lengths over to the node ids:
>
> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                            -tree      => $t1,
>                                            -compact   => 0);
> for my $node ($tree->get_nodes) {
>     $node->id($node->branch_length) if defined $node->branch_length;
> }
> $obj1->print(-file => "$dir/$stem.eps")
>
> Incidentally, in your script you write the tree out to a file, then read
> it back in using TreeIO. This is unnecessary, you can use $tree directly
> as input to Bio::Tree::Draw::Cladogram.
>
> Alternatively, you could write out a newick file and use non-Bioperl
> software such as njplot or MEGA to draw your tree with labelled branch
> lengths.
>
> Cheers,
> Roy.
>
> On 03/08/2011 07:00, Shachi Gahoi wrote:
>
> Dear All
>
> I am using Bio::Tree modules for constructing and drawing tree. *I am unable
> to show branch length value in tree.
> *
> Please tell me how I can do this, if anybody knows.
>
> Here is the script which I am using, and I also attached the generated tree.
>
> Thanks in advance
>
> ##################################################################
>
> use Bio::AlignIO;
> use Bio::Align::ProteinStatistics;
> use Bio::Tree::DistanceFactory;
> use Bio::TreeIO;
> use Bio::Tree::Draw::Cladogram;
>
> # for a dna alignment
> # can also use ProteinStatistics
>
> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw');
>
> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');
>
> my $stats = Bio::Align::ProteinStatistics->new;
>
> my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd');
>
> while( my $aln = $alnio->next_aln )
> {
>     my $mat = $stats->distance(-method => 'Kimura', -align => $aln);
>
>     my $tree = $dfactory->make_tree($mat);
>     $treeout->write_tree($tree);
> }
>
> my $dir = shift || '.';
>
> opendir(DIR, $dir) || die $!;
> for my $file ( readdir(DIR) )
> {
>     next unless $file =~ /(\S+)\.dnd$/;
>     my $stem = $1;
>     my $treeio = Bio::TreeIO->new('-format' => 'newick',
>                                   '-file'   => "$dir/$file");
>
>     if( my $t1 = $treeio->next_tree )
>     {
>         my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                                    -tree      => $t1,
>                                                    -compact   => 0);
>         $obj1->print(-file => "$dir/$stem.eps");
>     }
> }
>
> ##################################################################
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Regards,
> Shachi

From razi.khaja at gmail.com Thu Aug 4 13:39:28 2011
From: razi.khaja at gmail.com (Razi Khaja)
Date: Thu, 4 Aug 2011 13:39:28 -0400
Subject: [Bioperl-l] BioPerl on GitHub will not install
Message-ID:

All,

I just checked out the latest development version of BioPerl from GitHub and
found that it does not install because bp_das_server.pl is missing.

Building BioPerl
'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219
Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm line 218.
Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm line 218.
Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219.

After copying the bp_das_server.pl that I had from a previous installation to 'blib/script', I was able to run ./Build test and ./Build install on the development version I checked out.

Could someone try to reproduce this problem and, if it really is a problem, fix it on GitHub?

Thanks,

Razi

From cjfields at illinois.edu Thu Aug 4 13:42:48 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 4 Aug 2011 12:42:48 -0500
Subject: [Bioperl-l] [Bioperl-guts-l] BioPerl on GitHub will not install
In-Reply-To:
References:
Message-ID: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu>

Yes, I can replicate that. It's from the recent renaming of scripts. I'll look into it.

chris

On Aug 4, 2011, at 12:39 PM, Razi Khaja wrote:

> All,
>
> I just checked out the latest development version of BioPerl from GitHub and
> found that it does not install because bp_das_server.pl is missing.
>
> Building BioPerl
> 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are
> identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219
> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm
> line 218.
> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm
> line 218.
> Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl':
> No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219.
> > After copying the bp_das_server.pl that I had from a previous installation > to 'blib/script', I was able to ./Build test and ./Build install the > development version I checked out. > > Could someone test out this problem and fix it on github? if it really is a > problem? > > Thanks, > > Razi > _______________________________________________ > Bioperl-guts-l mailing list > Bioperl-guts-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l From hlapp at drycafe.net Thu Aug 4 17:31:52 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Thu, 4 Aug 2011 17:31:52 -0400 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: I agree. In fact I'm surprised that $io->_pushback() does not act like unshift() - that's I thought how it is used. -hilmar On Aug 3, 2011, at 10:16 AM, Chris Fields wrote: > On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: > >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Hi folks, >> >> as I mentioned on https://redmine.open-bio.org/issues/3264 there is >> something odd going on with Bio::Root::IO's _readline/_pushback >> functions. This seems to be intentional, at least there is a test >> case >> asserting the behaviour I'm seeing. It his however very confusing >> to the >> unexpecting programmer using the code. >> >> One assumption I'd immediately make would be that if I have code that >> does a $foo = $io->_readline; $io->_pushback($foo); $bar = >> $io->_readline;, $foo will be the same string as $bar, regardless >> what >> other pieces of the code did. Currently, this is not the case, >> because >> the readbuffer that _pushback pushes back into has new strings >> appended >> to the end but readline removes them from the front. > > I think this test is performed in the regressions already, but if > not then it is more than welcome. 
>
>> This easily violates the "principle of least surprise", so I think we
>> should change the readbuffer to a stack. As far as I can tell, changing
>> the _pushback function to "unshift" instead of "push" to the readbuffer
>> breaks only the Root/RootIO.t test designed to test the old behaviour. I
>> don't see any other tests failing on my system that don't fail without
>> this patch.
>>
>> Any comments from the core devs?
>
> I don't have a problem with that beyond the change to the RootIO.t
> tests (it implies a specific behavior that some developers expect,
> so is a very subtle API change). However, this is how one would
> expect it, to be more like an 'unread' stack instead of a queue. In
> fact, there is a module I used for Biome's pushback/readline called
> IO::Unread that implements an IO layer for mimicking this behavior,
> might be worth looking into.
>
>> Cheers,
>> Kai
>
> chris
>
>
> Christopher Fields
> Senior Research Scientist
> National Center for Supercomputing Applications
> Institute for Genomic Biology
> University of Illinois Urbana-Champaign
> 1206 W. Gregory Dr. , MC-195
> Urbana, IL 61801
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================

From cjfields at illinois.edu Thu Aug 4 17:42:30 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 4 Aug 2011 16:42:30 -0500
Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior
In-Reply-To:
References: <4E390CE8.2050100@biotech.uni-tuebingen.de>
Message-ID: <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu>

Yeah, it's a queue; the 'buffering' is a simple internal array using push/shift. I say we merge the change in from the branch and fix any modules accordingly.
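To make the queue-versus-stack difference concrete, here is a minimal sketch. It is not the Bio::Root::IO code: ToyReader, read_line and push_back are hypothetical names used only for illustration, and the buffer is just a plain Perl array. The only thing that changes between the two modes is whether a pushed-back line is added with push (the queue behaviour described above) or with unshift (the proposed stack behaviour).

```perl
use strict;
use warnings;

# Hypothetical toy class for illustration only; NOT Bio::Root::IO.
package ToyReader;

sub new {
    my ($class, %args) = @_;
    my $self = {
        buffer => [],                    # pushed-back lines live here
        lifo   => $args{lifo} ? 1 : 0,   # 0 = push (queue), 1 = unshift (stack)
    };
    return bless $self, $class;
}

# Return a line to the buffer: at the end (queue) or at the front (stack).
sub push_back {
    my ($self, $line) = @_;
    if ($self->{lifo}) {
        unshift @{ $self->{buffer} }, $line;
    }
    else {
        push @{ $self->{buffer} }, $line;
    }
    return;
}

# Always read from the front of the buffer.
sub read_line {
    my ($self) = @_;
    return shift @{ $self->{buffer} };
}

package main;

# Mimic the read/pushback/read sequence when other code has already
# pushed two lines back before we get to run.
sub demo {
    my ($lifo) = @_;
    my $io = ToyReader->new(lifo => $lifo);
    $io->push_back($_) for ('first', 'second');
    my $foo = $io->read_line;
    $io->push_back($foo);
    my $bar = $io->read_line;
    return ($foo, $bar);
}

for my $lifo (0, 1) {
    my ($foo, $bar) = demo($lifo);
    printf "%s: foo=%s bar=%s\n", ($lifo ? 'unshift' : 'push'), $foo, $bar;
}
```

With push the script prints "push: foo=first bar=second", so the line read after the pushback is not the line that was pushed back; with unshift it prints "unshift: foo=second bar=second", which is the round-trip behaviour one would expect.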
chris On Aug 4, 2011, at 4:31 PM, Hilmar Lapp wrote: > I agree. In fact I'm surprised that $io->_pushback() does not act like unshift() - that's I thought how it is used. > > -hilmar > > On Aug 3, 2011, at 10:16 AM, Chris Fields wrote: > >> On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: >> >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> Hi folks, >>> >>> as I mentioned on https://redmine.open-bio.org/issues/3264 there is >>> something odd going on with Bio::Root::IO's _readline/_pushback >>> functions. This seems to be intentional, at least there is a test case >>> asserting the behaviour I'm seeing. It his however very confusing to the >>> unexpecting programmer using the code. >>> >>> One assumption I'd immediately make would be that if I have code that >>> does a $foo = $io->_readline; $io->_pushback($foo); $bar = >>> $io->_readline;, $foo will be the same string as $bar, regardless what >>> other pieces of the code did. Currently, this is not the case, because >>> the readbuffer that _pushback pushes back into has new strings appended >>> to the end but readline removes them from the front. >> >> I think this test is performed in the regressions already, but if not then it is more than welcome. >> >>> This easily violates the "principle of least surprise", so I think we >>> should change the readbuffer to a stack. As far as I can tell, changing >>> the _pushback function to "unshift" instead of "push" to the readbuffer >>> breaks only the Root/RootIO.t test designed to test the old behaviour. I >>> don't see any other tests failing on my system that don't fail without >>> this patch. >>> >>> Any comments from the core devs? >> >> I don't have a problem with that beyond the change to the RootIO.t tests (it implies a specific behavior that some developers expect, so is a very subtle API change). However, this is how one would expect it, to be more like an 'unread' stack instead of a queue. 
In fact, there is a module I used for Biome's pushback/readline called IO::Unread that implements an IO layer for mimicing this behavior, might be worth looking into. >> >>> Cheers, >>> Kai >> >> chris >> >> >> Christopher Fields >> Senior Research Scientist >> National Center for Supercomputing Applications >> Institute for Genomic Biology >> University of Illinois Urbana-Champaign >> 1206 W. Gregory Dr. , MC-195 >> Urbana, IL 61801 >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 4 18:11:29 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 17:11:29 -0500 Subject: [Bioperl-l] [Bioperl-guts-l] BioPerl on GitHub will not install In-Reply-To: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> References: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> Message-ID: <0A691C42-539E-45A1-B44F-7B0B5D8DE3D8@illinois.edu> Now fixed on github. There was some cruft left in Bio::Root::Build that didn't deal with the recent script renaming. chris On Aug 4, 2011, at 12:42 PM, Chris Fields wrote: > Yes, I can replicate that. It's from the recent renaming for scripts. I'll look into it. > > chris > > On Aug 4, 2011, at 12:39 PM, Razi Khaja wrote: > >> All, >> >> I just checked out the latest development version of BioPerl from GitHub and >> found that it does not install because bp_das_server.pl is missing. 
>> >> Building BioPerl >> 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are >> identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 >> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm >> line 218. >> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm >> line 218. >> Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': >> No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. >> >> After copying the bp_das_server.pl that I had from a previous installation >> to 'blib/script', I was able to ./Build test and ./Build install the >> development version I checked out. >> >> Could someone test out this problem and fix it on github? if it really is a >> problem? >> >> Thanks, >> >> Razi >> _______________________________________________ >> Bioperl-guts-l mailing list >> Bioperl-guts-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From shachigahoimbi at gmail.com Fri Aug 5 01:40:11 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Fri, 5 Aug 2011 11:10:11 +0530 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: <4E3A80EF.2010409@gmail.com> References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> Message-ID: Instead of both node id and accession, Can I replace node id with accession? On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote: > Hi Shachi, > > Please keep replies on the mailing list, that way others can follow the > discussion. > > As I mentioned, it is not possible to draw njplot-style trees with labelled > branches using Bio::Tree::Draw::Cladogram, it currently only labels nodes > (you could perhaps add branch labels as a feature request on Redmine). 
> > The code I gave overwrites the existing "leaf" node ids (the accessions) > with branch lengths, if you want to also keep the existing labels you could > try something like: > > > for my $node ($t1->get_nodes) { > if ($node->is_Leaf) { > $node->id($node->branch_**length.' '.$node->id); > } else { > > $node->id($node->branch_**length) > } > } > > Cheers, > Roy. > > > On 04/08/2011 05:36, Shachi Gahoi wrote: > >> Thank You so much. Now branch length is coming in tree. >> >> But I want Accesssion number in place of node id. >> >> I attached snapshot of tree as I want. Please tell me how can I do this. >> >> >> >> >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri > >> wrote: >> >> Sorry, the code had a typo, it should be: >> >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($t1->get_nodes) { >> >> $node->id($node->branch___**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> On 03/08/2011 14:58, Roy Chaudhuri wrote: >> >> Hi Shachi, >> >> I don't think you can draw labels on branches using >> Bio::Tree::Draw::Cladogram. However, it will draw node labels, >> so you >> could copy the branch lengths over to the node ids: >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($tree->get_nodes) { >> $node->id($node->branch___**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> Incidentally, in your script you write the tree out to a file, >> then read >> it back in using TreeIO. This is unnecessary, you can use $tree >> directly >> as input to Bio::Tree::Draw::Cladogram. >> >> Alternatively, you could write out a newick file and use >> non-Bioperl >> software such as njplot or MEGA to draw your tree with labelled >> branch >> lengths. >> >> Cheers, >> Roy. 
>> >> On 03/08/2011 07:00, Shachi Gahoi wrote: >> >> Dear All >> >> I am using Bio::Tree modules for constructing and drawing >> tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also attached >> generated tree. >> >> Thanks in advance >> >> ##############################**__############################ >> **##__##########################**####__###### >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', >> -format=>'clustalw'); >> >> my $dfactory = Bio::Tree::DistanceFactory->__**new(-method => >> 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics-**__>new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', -file >> =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', -align >> => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = >> Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree >> => $t1, >> >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> ##############################**__############################ >> **##__##########################**####__############## >> >> >> >> >> ______________________________**___________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> >> > >> >> http://lists.open-bio.org/__**mailman/listinfo/bioperl-l >> >> > >> >> >> >> >> 
>> >> -- >> Regards, >> Shachi >> > > -- Regards, Shachi From kai.blin at biotech.uni-tuebingen.de Fri Aug 5 04:40:57 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Fri, 05 Aug 2011 10:40:57 +0200 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> Message-ID: <4E3BAC99.8050806@biotech.uni-tuebingen.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2011-08-04 23:42, Chris Fields wrote: > Yeah, it's a queue; the 'buffering' is a simple internal array using > push/shift. I say we merge the change in from the branch and fix > any modules accordingly. Ok, I'm happy to take care of it, if people can tell me how to find and fix modules that use the old assumption. My initial attempt right after making the change was to run the test suite, which came up clean apart from the RootIO.t case that my patch now modifies as well. Cheers, Kai - -- Dipl.-Inform. 
Kai Blin                        kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-Universität Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax   : ++49 7071 29-5979
Germany
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOO6yZAAoJEKM5lwBiwTTPdjsH/0ELbz9VYIzxlpx+QZ3Jvd55
KTXVP+oOzjIDlOdxbdqYR0w04VXnpkQek3hVt0mbreuKvtdMJY/YhRwZLiOzYSak
ruhswUJQnm3K2vkaqpgLESIIUASneFrW7ezfV3R9q/Ov730GBDAtkLTEk7cVV5Cg
W515ixJtNC7v6fZmNFJZudQbcUYYgy+8BFgvNUaSoH8YqubMXzjFXknBWeWT0qco
ivHjqIc6Nkap799ijPiLEU7ArI1pEOB2jyvjntIocFR72imbo7e86RaVHJCNl/N7
GFbRGoH2m7LVeWFYuNM3vsTS3W4KVLg9U/8UBysykR3uoHAVJhm4T5nCT4NKE/w=
=z6QZ
-----END PGP SIGNATURE-----

From roy.chaudhuri at gmail.com Fri Aug 5 06:54:32 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 05 Aug 2011 11:54:32 +0100
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To:
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com>
Message-ID: <4E3BCBE8.4030303@gmail.com>

In that case then you only want to add branch lengths to non-leaf nodes, so it would be:

for my $node ($t1->get_nodes) {
    $node->id($node->branch_length) unless $node->is_Leaf
}

On 05/08/2011 06:40, Shachi Gahoi wrote:
>
> Instead of both node id and accession, Can I replace node id with accession?
>
>
> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote:
>
> Hi Shachi,
>
> Please keep replies on the mailing list, that way others can follow
> the discussion.
>
> As I mentioned, it is not possible to draw njplot-style trees with
> labelled branches using Bio::Tree::Draw::Cladogram, it currently
> only labels nodes (you could perhaps add branch labels as a feature
> request on Redmine).
>
>     The code I gave overwrites the existing "leaf" node ids (the
>     accessions) with branch lengths, if you want to also keep the
>     existing labels you could try something like:
>
>     for my $node ($t1->get_nodes) {
>         if ($node->is_Leaf) {
>             $node->id($node->branch_length.' '.$node->id);
>         } else {
>             $node->id($node->branch_length)
>         }
>     }
>
>     Cheers,
>     Roy.
>
>     On 04/08/2011 05:36, Shachi Gahoi wrote:
>
>         Thank You so much. Now branch length is coming in tree.
>
>         But I want Accession number in place of node id.
>
>         I attached snapshot of tree as I want. Please tell me how can I
>         do this.
>
>         On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri wrote:
>
>             Sorry, the code had a typo, it should be:
>
>             my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                                        -tree      => $t1,
>                                                        -compact   => 0);
>             for my $node ($t1->get_nodes) {
>                 $node->id($node->branch_length) if defined $node->branch_length;
>             }
>             $obj1->print(-file => "$dir/$stem.eps")
>
>             On 03/08/2011 14:58, Roy Chaudhuri wrote:
>
>                 Hi Shachi,
>
>                 I don't think you can draw labels on branches using
>                 Bio::Tree::Draw::Cladogram. However, it will draw node
>                 labels, so you could copy the branch lengths over to
>                 the node ids:
>
>                 my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                                            -tree      => $t1,
>                                                            -compact   => 0);
>                 for my $node ($tree->get_nodes) {
>                     $node->id($node->branch_length) if defined $node->branch_length;
>                 }
>                 $obj1->print(-file => "$dir/$stem.eps")
>
>                 Incidentally, in your script you write the tree out to a
>                 file, then read it back in using TreeIO. This is
>                 unnecessary, you can use $tree directly as input to
>                 Bio::Tree::Draw::Cladogram.
>
>                 Alternatively, you could write out a newick file and use
>                 non-Bioperl software such as njplot or MEGA to draw your
>                 tree with labelled branch lengths.
>
>                 Cheers,
>                 Roy.
>
>                 On 03/08/2011 07:00, Shachi Gahoi wrote:
>
>                     Dear All
>
>                     I am using Bio::Tree modules for constructing and
>                     drawing tree. *I am unable to show branch length
>                     value in tree.*
>                     Please tell me How can I do this, if anybody knows.
>
>                     Here is my script which i am using...and i also
>                     attached generated tree.
>
>                     Thanks in advance
>
>                     ##################################################
>
>                     use Bio::AlignIO;
>                     use Bio::Align::ProteinStatistics;
>                     use Bio::Tree::DistanceFactory;
>                     use Bio::TreeIO;
>                     use Bio::Tree::Draw::Cladogram;
>
>                     # for a dna alignment
>                     # can also use ProteinStatistics
>
>                     my $alnio = Bio::AlignIO->new(-file => 'ADP.aln',
>                                                   -format=>'clustalw');
>
>                     my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');
>
>                     my $stats = Bio::Align::ProteinStatistics->new;
>
>                     my $treeout = Bio::TreeIO->new(-format => 'newick',
>                                                    -file =>'>ADP1.dnd');
>
>                     while( my $aln = $alnio->next_aln )
>                     {
>                         my $mat = $stats->distance(-method => 'Kimura',
>                                                    -align => $aln);
>
>                         my $tree = $dfactory->make_tree($mat);
>                         $treeout->write_tree($tree);
>                     }
>
>                     my $dir = shift || '.';
>
>                     opendir(DIR, $dir) || die $!;
>                     for my $file ( readdir(DIR) )
>                     {
>                         next unless $file =~ /(\S+)\.dnd$/;
>                         my $stem = $1;
>                         my $treeio = Bio::TreeIO->new('-format' => 'newick',
>                                                       '-file' => "$dir/$file");
>
>                         if( my $t1 = $treeio->next_tree )
>                         {
>                             my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                                                        -tree => $t1,
>                                                                        -compact => 0);
>                             $obj1->print(-file => "$dir/$stem.eps");
>                         }
>                     }
>
>                     ##################################################
>
>                     _______________________________________________
>                     Bioperl-l mailing list
>                     Bioperl-l at lists.open-bio.org
>                     http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>         --
>         Regards,
>         Shachi
>
> --
> Regards,
> Shachi

From lskatz at
gmail.com Fri Aug 5 10:32:50 2011 From: lskatz at gmail.com (Lee Katz) Date: Fri, 5 Aug 2011 10:32:50 -0400 Subject: [Bioperl-l] SeqIO: paired end reads In-Reply-To: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> Message-ID: Thank you. I figured out through the Newbler manual that there is a linker sequence to separate the paired end reads. Then, the forum at http://seqanswers.com/forums/showthread.php?t=12940 showed me that the linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC". I think a useful addition to bioperl could be to have paired end reads. This is outside of the domain of bioperl, but now I am left wondering how I could specify the distance between reads in Newbler, if the linker sequence is fixed. On Wed, Aug 3, 2011 at 5:17 PM, Jason Stajich wrote: > it depends on the assembler - For Illumina usually the paired ends end with > /1 /2 and they have the same ID but are in two different files. Depends on > if you are using interleaved paired reads or in two separate files. some > just expect the paired reads to be mated by virtue of being in same order in > two files. the ABYSS and Velvet manuals both explain what is expected so > you will want to check on what are Newbler's assumptions on how the paired > ends are encoded. > > There are simulator tools if that is what you are trying to do in the end? > checkout wgsim which comes with samtools or try dnaa > > > On Aug 3, 2011, at 1:01 PM, Lee Katz wrote: > > > Hi all! I was wondering how to construct paired end reads from scratch. > I > > know the locations of certain sequences across the genome with a high > degree > > of confidence and so I want to give them to my assembler as paired end > > reads, along with my other sequence runs (454 and Illumina runs). I plan > to > > use Newbler. 
> > > > My only problem is that I do not know the correct format in order to > specify > > distance and sequences for a paired end reads run, and so I hope that > there > > is a SeqIO solution. At the least, I hope that one bioperl member can > point > > me to where the definition of the paired end reads file format is...? > > > > Thank you! > > > > --Lee > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Fri Aug 5 11:50:42 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 5 Aug 2011 10:50:42 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4E3BAC99.8050806@biotech.uni-tuebingen.de> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> <4E3BAC99.8050806@biotech.uni-tuebingen.de> Message-ID: <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> I would just go based on the test suite for now. If we run into others that don't have tests we need to add new tests for those anyway. chris On Aug 5, 2011, at 3:40 AM, Kai Blin wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 2011-08-04 23:42, Chris Fields wrote: > >> Yeah, it's a queue; the 'buffering' is a simple internal array using >> push/shift. I say we merge the change in from the branch and fix >> any modules accordingly. > > Ok, I'm happy to take care of it, if people can tell me how to find and > fix modules that use the old assumption. My initial attempt right after > making the change was to run the test suite, which came up clean apart > from the RootIO.t case that my patch now modifies as well. > > Cheers, > Kai > > - -- > Dipl.-Inform. 
Kai Blin kai.blin at biotech.uni-tuebingen.de > Institute for Microbiology and Infection Medicine > Division of Microbiology/Biotechnology > Eberhard-Karls-Universit?t T?bingen > Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 > D-72076 T?bingen Fax : ++49 7071 29-5979 > Germany > Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iQEcBAEBAgAGBQJOO6yZAAoJEKM5lwBiwTTPdjsH/0ELbz9VYIzxlpx+QZ3Jvd55 > KTXVP+oOzjIDlOdxbdqYR0w04VXnpkQek3hVt0mbreuKvtdMJY/YhRwZLiOzYSak > ruhswUJQnm3K2vkaqpgLESIIUASneFrW7ezfV3R9q/Ov730GBDAtkLTEk7cVV5Cg > W515ixJtNC7v6fZmNFJZudQbcUYYgy+8BFgvNUaSoH8YqubMXzjFXknBWeWT0qco > ivHjqIc6Nkap799ijPiLEU7ArI1pEOB2jyvjntIocFR72imbo7e86RaVHJCNl/N7 > GFbRGoH2m7LVeWFYuNM3vsTS3W4KVLg9U/8UBysykR3uoHAVJhm4T5nCT4NKE/w= > =z6QZ > -----END PGP SIGNATURE----- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 5 16:49:54 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 5 Aug 2011 15:49:54 -0500 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Message-ID: <1FDBD8D4-E8E6-44EB-A18A-7E74A0EF9014@illinois.edu> Okay, I tested this out on a branch and then merged into 'master'. Test::Most is a 'build_requires'; Bio::Root::Test is now just a wrapper for Test::Most methods, with a few extra wrinkles to deal with Test::Warn and a few additional methods. I also removed extraneous modules in t/lib along with Bio::Root::Test::Warn (that code was merged into Bio::Root::Test to keep all evilness in one contained location). The nice thing is the transition didn't require changing any tests. 
However, this will require some testing across the board to make sure everything's working. Maybe worth getting the code cleaned up for another quick point release prior to the GSoC mayhem to ensue shortly... :) chris On Aug 1, 2011, at 3:34 PM, Chris Fields wrote: > Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! > > chris > > On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > >> I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. >> >> -hilmar >> >> On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: >> >>> All, >>> >>> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >>> >>> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). >>> >>> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >>> >>> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. 
Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >>> >>> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >>> >>> >>> chris >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : >> =========================================================== >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From kai.blin at biotech.uni-tuebingen.de Fri Aug 5 18:35:32 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Sat, 06 Aug 2011 00:35:32 +0200 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> <4E3BAC99.8050806@biotech.uni-tuebingen.de> <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> Message-ID: <4E3C7034.2000106@biotech.uni-tuebingen.de> On 2011-08-05 17:50, Chris Fields wrote: > I would just go based on the test suite for now. If we run into > others that don't have tests we need to add new tests for those > anyway. Ok, pushed to master. Cheers, Kai -- Dipl.-Inform. 
Kai Blin                         kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-University of Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax :   ++49 7071 29-5979
Deutschland
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben

From shachigahoimbi at gmail.com Sat Aug 6 00:25:43 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Sat, 6 Aug 2011 09:55:43 +0530
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To: <4E3BCBE8.4030303@gmail.com>
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> <4E3BCBE8.4030303@gmail.com>
Message-ID:

Thank you so much.

Please tell me one more thing, *can I reduce branch length font?*

On Fri, Aug 5, 2011 at 4:24 PM, Roy Chaudhuri wrote:

> In that case then you only want to add branch lengths to non-leaf nodes, so
> it would be:
>
> for my $node ($t1->get_nodes) {
>     $node->id($node->branch_length) unless $node->is_Leaf
> }

--
Regards,
Shachi

From p.j.a.cock at googlemail.com Sun Aug 7 05:40:52 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 7 Aug 2011 10:40:52 +0100
Subject: 
[Bioperl-l] SeqIO: paired end reads
In-Reply-To:
References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com>
Message-ID:

On Friday, August 5, 2011, Lee Katz wrote:
> Thank you. I figured out through the Newbler manual that there is a linker
> sequence to separate the paired end reads. Then, the forum at
> http://seqanswers.com/forums/showthread.php?t=12940 showed me that the
> linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC".

There is more than one Roche 454 linker sequence depending on the chemistry
used; one is the same as its reverse complement, one isn't.

There is nothing in the SFF file format (nor the Roche specific XML manifest
last time I checked) that handles the paired end information explicitly.

> I think a useful addition to bioperl could be to have paired end reads.

Maybe, but to do this well you'd want to do flow space alignment of the
reads to the linker sequence to find the imperfectly called linker
sequences.

Personally I use sff_extract, which is a free open source command line tool
for this (calling an external alignment tool for paired-end 454).

> This is outside of the domain of bioperl, but now I am left wondering how I
> could specify the distance between reads in Newbler, if the linker sequence
> is fixed.

How to do that depends on the alignment or assembly tool you are using.

Peter

From cjfields at illinois.edu Sun Aug 7 11:51:19 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 7 Aug 2011 10:51:19 -0500
Subject: [Bioperl-l] SeqIO: paired end reads
In-Reply-To:
References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com>
Message-ID: <19923C8B-6C84-4D9B-8D37-86CAE9BC681E@illinois.edu>

On Aug 7, 2011, at 4:40 AM, Peter Cock wrote:

> On Friday, August 5, 2011, Lee Katz wrote:
>> Thank you. I figured out through the Newbler manual that there is a linker
>> sequence to separate the paired end reads. Then, the forum at
>> http://seqanswers.com/forums/showthread.php?t=12940 showed me that the
>> linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC".
>
> There is more than one Roche 454 linker sequence depending on the chemistry
> used; one is the same as its reverse complement, one isn't.
>
> There is nothing in the SFF file format (nor the Roche specific XML manifest
> last time I checked) that handles the paired end information explicitly.

Yep, it's all implied AFAIK.

>> I think a useful addition to bioperl could be to have paired end reads.
>
> Maybe, but to do this well you'd want to do flow space alignment of the
> reads to the linker sequence to find the imperfectly called linker
> sequences.
>
> Personally I use sff_extract, which is a free open source command line tool
> for this (calling an external alignment tool for paired-end 454).

I think it could be done, but I would implement something like this as a wrapper around faster tools (like sff_extract or similar). Implementing the functionality in pure (bio)perl/(bio)python doesn't make much sense if there are newer/faster tools out there.

>> This is outside of the domain of bioperl, but now I am left wondering how I
>> could specify the distance between reads in Newbler, if the linker sequence
>> is fixed.
>
> How to do that depends on the alignment or assembly tool you are using.
>
> Peter

Yep. I don't think there is a defined way to specify that in any format that I know of.
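
For a quick look at a single read, the exact-match case can be sketched in plain Perl. This is only a minimal sketch under stated assumptions: split_at_linker is a hypothetical helper (not a BioPerl method), the linker is the one quoted in this thread, and the match must be perfect, which Peter notes is often not the case for real 454 reads. For production data a dedicated tool such as sff_extract is probably the better choice.

```perl
use strict;
use warnings;

# Linker quoted in this thread; other 454 chemistries use different linkers.
my $LINKER = 'GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC';

# Hypothetical helper: split a read at the first exact linker match.
# Returns (left, right), or nothing when no exact match is found.
sub split_at_linker {
    my ($read, $linker) = @_;
    my $pos = index(uc $read, uc $linker);
    return if $pos < 0;
    my $left  = substr($read, 0, $pos);
    my $right = substr($read, $pos + length $linker);
    return ($left, $right);
}

# Made-up read: 8 bases, the linker, then 8 more bases.
my ($left, $right) = split_at_linker("ACGTACGT${LINKER}TTGCAAGG", $LINKER);
print "left=$left right=$right\n";
```

Reads where the linker is imperfectly called simply come back unsplit here, so the fraction of unsplit reads gives a rough idea of how much an alignment-based approach would recover.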
chris

From Russell.Smithies at agresearch.co.nz Sun Aug 7 17:45:19 2011
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 8 Aug 2011 09:45:19 +1200
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To:
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> <4E3BCBE8.4030303@gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D3C9@exchsth.agresearch.co.nz>

The constructor for Bio::Tree::Draw::Cladogram lets you specify the font and size, did you try setting it there?

 Title   : new
 Usage   : my $obj = Bio::Tree::Draw::Cladogram->new();
 Function: Builds a new Bio::Tree::Draw::Cladogram object
 Returns : Bio::Tree::Draw::Cladogram
 Args    : -tree      => Bio::Tree::Tree object
           -second    => Bio::Tree::Tree object (optional)
           -font      => font name [string] (optional)       <<<<-------------
           -size      => font size [integer] (optional)      <<<<-------------
           -top       => top margin [integer] (optional)
           -bottom    => bottom margin [integer] (optional)
           -left      => left margin [integer] (optional)
           -right     => right margin [integer] (optional)
           -tip       => extra tip space [integer] (optional)
           -column    => extra space between cladograms [integer] (optional)
           -compact   => ignore branch lengths [boolean] (optional)
           -ratio     => horizontal to vertical ratio [integer] (optional)
           -colors    => use colors to color edges [boolean] (optional)
           -bootstrap => draw bootstrap or internal ids [boolean]

--Russell

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Shachi Gahoi
> Sent: Saturday, 6 August 2011 4:26 p.m.
> To: Roy Chaudhuri
> Cc: bioperl-l List
> Subject: Re: [Bioperl-l] How to show branch length value in tree
>
> Thank you so much.
>
> Please tell me one more thing, *can I reduce branch length font?*
>
> --
> Regards,
> Shachi

From cjfields at illinois.edu Tue Aug 9 16:10:37 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 9 Aug 2011 15:10:37 -0500
Subject: [Bioperl-l] Question to Bio::SearchIO::infernal.pm
In-Reply-To:
References: <4E32E14B020000EE00004F57@gwia1.boku.ac.at>
Message-ID: <683C7B42-338F-42AE-AF93-11BFB4DB2CB7@illinois.edu>

Following this up: Nadine, did you have a bug to report? It's kind of hard to fix this without some example data.

chris

On Aug 3, 2011, at 8:10 AM, Chris Fields wrote:

> Nadine,
>
> Hard to guess w/o seeing the report, but I'm not terribly surprised. I believe I only coded for simple 1 CM reports, IIRC. You'll have to file this as a bug on redmine along with an example.
>
> chris
>
> On Jul 29, 2011, at 9:35 AM, Nadine Elpida Tatto wrote:
>
>> Hi There!
>>
>> I was wondering if you would or can help me.
>>
>> I have an infernal report containing about 2000 CMs from an infernal run against Rfam.cm. To parse this report I wanted to use Bio::SearchIO::infernal.pm.
>> Unfortunately this turned out to be a problem for me, because "$parser->next_result" only delivers the result for the first CM in the report and nothing more.
>>
>> My code:
>>
>> #!/usr/bin/perl -w
>>
>> use strict;
>> use Data::Dumper;
>> use Bio::SearchIO;
>>
>> my $infile = $ARGV[0]; # infernal report
>> my $parser = Bio::SearchIO->new(-format => 'Infernal',
>>                                 -file   => $infile);
>>
>> while( my $result = $parser->next_result ) {
>>     print $result->query_name . "\n";
>> }
>>
>> exit;
>>
>> The output:
>>
>> ntatto:~$ ./infernalParser.pl infernal.output
>> 5S_rRNA
>> ntatto:~$
>>
>> I would expect the following (like parsing a blast report):
>>
>> ntatto:~$ ./infernalParser.pl infernal.output
>> 5S_rRNA
>> 5_8S_rRNA
>> U1
>> ...
>> ntatto:~$
>>
>> I would be glad for help.
>>
>> Thank you in advance.
>>
>> Best Regards
>>
>> N Tatto

From torsten.seemann at infotech.monash.edu.au Sun Aug 14 04:32:46 2011
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sun, 14 Aug 2011 18:32:46 +1000
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID:

> I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.
I'm pretty sure that NCBI/Genbank do *not* accept Genbank files for submission - which I found somewhat ironic! They require an ASN1 formatted file (XML-like hierarchial format, pre-dates XML), which is sometimes given a .sqn extenison if you use the Sequin GUI to prepare it. There are command line tools like "tbl2asn" which will take the .tbl and .fsa files Brian has listed to produce the ASN file too. As far as I know, there is no NCBI tools to take a .gbk and produce the .tbl/.fsa/.agp - does anyone know otherwise? -- --Torsten Seemann --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash University, AUSTRALIA From cjfields at illinois.edu Sun Aug 14 10:22:10 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 14 Aug 2011 09:22:10 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Message-ID: <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu> Not that I'm aware of, though it shouldn't be hard to set something up using Bio::SeqIO for that. chris On Aug 14, 2011, at 3:32 AM, Torsten Seemann wrote: >> I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. > > I'm pretty sure that NCBI/Genbank do *not* accept Genbank files for > submission - which I found somewhat ironic! > > They require an ASN1 formatted file (XML-like hierarchial format, > pre-dates XML), which is sometimes given a .sqn extenison if you use > the Sequin GUI to prepare it. There are command line tools like > "tbl2asn" which will take the .tbl and .fsa files Brian has listed to > produce the ASN file too. 
>
> As far as I know, there are no NCBI tools to take a .gbk and produce
> the .tbl/.fsa/.agp - does anyone know otherwise?
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
> University, AUSTRALIA

From maximilien1er at gmail.com Sun Aug 14 10:23:39 2011
From: maximilien1er at gmail.com (Maxime Déraspe)
Date: Sun, 14 Aug 2011 10:23:39 -0400
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To: 
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID: <1313331819.15034.4.camel@maximilian-home>

I know that Artemis from the Sanger Institute can convert a GenBank file into a Sequin tab file. You could then use that file to submit to NCBI with their Sequin software. But I think that the GenBank file would be OK too.

Max

On Sun, 2011-08-14 at 18:32 +1000, Torsten Seemann wrote:
> > I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.
>
> I'm pretty sure that NCBI/GenBank do *not* accept GenBank files for
> submission - which I found somewhat ironic!
>
> They require an ASN.1-formatted file (XML-like hierarchical format,
> pre-dates XML), which is sometimes given a .sqn extension if you use
> the Sequin GUI to prepare it. There are command line tools like
> "tbl2asn" which will take the .tbl and .fsa files Brian has listed to
> produce the ASN file too.
>
> As far as I know, there are no NCBI tools to take a .gbk and produce
> the .tbl/.fsa/.agp - does anyone know otherwise?
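
On Chris's remark that a GenBank-to-.tbl/.fsa converter should not be hard to set up with Bio::SeqIO: the following is only a minimal sketch of what that might look like, not a tested submission tool. The function names (feature_to_tbl, genbank_to_tbl_and_fsa), the script name gbk2tbl.pl and the plain-hash layout passed between them are made up for illustration. It assumes features sit on a single strand, ignores partial locations ("<" and ">"), does not check which qualifiers tbl2asn will accept, and does not produce an .agp file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Format one feature as lines of a Sequin-style 5-column feature table.
# $feat is a plain hash (a hypothetical structure for this sketch):
#   key    => feature key, e.g. 'CDS'
#   strand => 1 or -1
#   spans  => [ [start, end], ... ]   1-based coordinates, any order
#   quals  => [ [name, value], ... ]
sub feature_to_tbl {
    my ($feat) = @_;
    my @spans = sort { $a->[0] <=> $b->[0] } @{ $feat->{spans} };

    # Minus-strand features are listed 5' to 3': last interval first,
    # and each interval written with start > end.
    if ( ( $feat->{strand} || 1 ) < 0 ) {
        @spans = map { [ $_->[1], $_->[0] ] } reverse @spans;
    }

    my @lines;
    for my $i ( 0 .. $#spans ) {
        my ( $from, $to ) = @{ $spans[$i] };
        # Only the first interval carries the feature key (column 3).
        push @lines, $i == 0 ? "$from\t$to\t$feat->{key}" : "$from\t$to";
    }
    # Qualifiers go in columns 4 and 5, after three empty columns.
    push @lines, "\t\t\t$_->[0]\t$_->[1]" for @{ $feat->{quals} };
    return join( "\n", @lines ) . "\n";
}

# Read a GenBank file and write a FASTA file plus a feature table.
sub genbank_to_tbl_and_fsa {
    my ( $gbk, $tbl, $fsa ) = @_;
    require Bio::SeqIO;    # loaded here so feature_to_tbl() runs without BioPerl

    my $in  = Bio::SeqIO->new( -file => $gbk,    -format => 'genbank' );
    my $out = Bio::SeqIO->new( -file => ">$fsa", -format => 'fasta' );
    open my $fh, '>', $tbl or die "Cannot write $tbl: $!";

    while ( my $seq = $in->next_seq ) {
        $out->write_seq($seq);
        print {$fh} '>Feature ', $seq->display_id, "\n";
        for my $f ( $seq->get_SeqFeatures ) {
            next if $f->primary_tag eq 'source';    # skipped in this sketch
            my @quals;
            for my $tag ( $f->get_all_tags ) {
                next if $tag eq 'translation';      # not copied in this sketch
                push @quals, [ $tag, $_ ] for $f->get_tag_values($tag);
            }
            my @spans = map { [ $_->start, $_->end ] } $f->location->each_Location;
            my %feat  = (
                key   => $f->primary_tag,
                strand => $f->strand,
                spans => \@spans,
                quals => \@quals,
            );
            print {$fh} feature_to_tbl( \%feat );
        }
    }
    close $fh or die "Cannot close $tbl: $!";
}

# Usage: perl gbk2tbl.pl input.gbk output.tbl output.fsa
genbank_to_tbl_and_fsa(@ARGV) if @ARGV == 3;
```

The formatting step is kept separate from the BioPerl reading step so that it can be checked on its own with hand-written features before trying it on a real record; the resulting .tbl would still need to be validated with tbl2asn itself.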

From punit_vergoboy2004 at yahoo.co.in Thu Aug 18 08:14:54 2011
From: punit_vergoboy2004 at yahoo.co.in (punit kumar)
Date: Thu, 18 Aug 2011 17:44:54 +0530 (IST)
Subject: [Bioperl-l] query about Bio::Tools::Run::RemoteBlast
In-Reply-To: <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>
Message-ID: <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>

Hi friends,

I am new to BioPerl, and I am using "Bio::Tools::Run::RemoteBlast" for remote BLAST. I tried to use this module and have had a little success so far. I want to get the description part of the BLAST alignments that were found against my query sequence, as shown in the result format given below, which is the output table of online BLAST:

Sequences producing significant alignments:
Accession Description Max score Total score Query coverage E value Links
NP_216760.1 acyl carrier protein [Mycobacterium tuberculosis H37Rv] >ref|NP_336774.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >ref|NP_855917.1| acyl carrier protein [Mycobacterium bovis AF2122/97] >ref|YP_978350.1| acyl carrier protein [Mycobacterium bovis BCG str. Pasteur 1173P2] >ref|YP_001283588.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_001288206.1| acyl carrier protein [Mycobacterium tuberculosis F11] >ref|ZP_02551632.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_002645307.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >ref|YP_003031689.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >ref|ZP_04925721.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >ref|ZP_04981085.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str.
Haarlem] >ref|ZP_05141736.1| acyl carrier protein [Mycobacterium tuberculosis '98-R604 INH-RIF-EM'] >ref|ZP_06433498.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >ref|ZP_06437620.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >ref|ZP_06443178.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >ref|ZP_06450592.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >ref|ZP_06455160.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >ref|ZP_06504896.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >ref|ZP_06510220.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >ref|ZP_06513730.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >ref|ZP_06517747.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >ref|ZP_06521786.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >ref|ZP_06799170.1| acyl carrier protein [Mycobacterium tuberculosis 210] >ref|ZP_06952619.1| acyl carrier protein [Mycobacterium tuberculosis KZN 4207] >ref|ZP_06960948.1| acyl carrier protein [Mycobacterium tuberculosis KZN R506] >ref|ZP_07013145.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >ref|ZP_07414839.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >ref|ZP_07418616.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >ref|ZP_07423348.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >ref|ZP_07427715.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >ref|ZP_07432018.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 
SUMu005] >ref|ZP_07436410.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >ref|ZP_07440655.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >ref|ZP_07445228.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >ref|ZP_07481045.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >ref|ZP_07485275.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >ref|ZP_07489492.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >ref|ZP_07494023.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >ref|ZP_07816044.1| acyl carrier protein [Mycobacterium tuberculosis KZN V2475] >ref|YP_004723912.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >ref|YP_004745700.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >sp|P0A4W6.1|ACPM_MYCTU RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >sp|P0A4W7.1|ACPM_MYCBO RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA94640.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium tuberculosis H37Rv] >gb|AAK46588.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >emb|CAD97121.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium bovis AF2122/97] >emb|CAL72249.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. Pasteur 1173P2] >gb|EAY60463.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >gb|EBA42598.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. 
Haarlem] >gb|ABQ74026.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium tuberculosis H37Ra] >gb|ABR06604.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis F11] >dbj|BAH26539.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >gb|ACT24794.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >gb|EFD13913.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >gb|EFD18035.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >gb|EFD21093.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >gb|EFD43942.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >gb|EFD47767.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >gb|EFD53534.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >gb|EFD58858.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >gb|EFD62368.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >gb|EFD73930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >gb|EFD77945.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >gb|EFI30824.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >gb|EFO74536.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >gb|EFP15742.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >gb|EFP19094.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >gb|EFP22930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >gb|EFP26734.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 
SUMu005] >gb|EFP30496.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >gb|EFP33906.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >gb|EFP38213.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >gb|EFP42922.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >gb|EFP46864.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >gb|EFP50800.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >gb|EFP54373.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >gb|EGB28294.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CDC1551A] >gb|EGE50793.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis W-148] >gb|AEB03875.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 4207] >gb|AEJ47271.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5079] >gb|AEJ50890.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5180] >emb|CCC27325.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >emb|CCC44598.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >emb|CCC64838.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. Moreau RDJ] 223 223 100% 1e-74
1KLP_A Chain A, The Solution Structure Of Acyl Carrier Protein From Mycobacterium Tuberculosis 220 220 99% 2e-73
ZP_04748738.1 acyl carrier protein [Mycobacterium kansasii ATCC 12478] 165 165 100% 9e-52
ZP_05224070.1 acyl carrier protein [Mycobacterium intracellulare ATCC 13950] 162 162 100% 8e-51
NP_960931.1 acyl carrier protein [Mycobacterium avium subsp.
paratuberculosis K-10] >ref|YP_881402.1| acyl carrier protein [Mycobacterium avium 104] >ref|ZP_05216419.1| acyl carrier protein [Mycobacterium avium subsp. avium ATCC 25291] >gb|AAS04314.1| AcpM [Mycobacterium avium subsp. paratuberculosis K-10] >gb|ABK65172.1| acyl carrier protein [Mycobacterium avium 104] >gb|EGO40713.1| acyl carrier protein [Mycobacterium avium subsp. paratuberculosis S397] 162 162 100% 8e-51
NP_302135.1 acyl carrier protein [Mycobacterium leprae TN] >ref|YP_002503765.1| acyl carrier protein [Mycobacterium leprae Br4923] >sp|O69475.1|ACPM_MYCLE RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA19202.1| acyl carrier protein [Mycobacterium leprae] >emb|CAC30605.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae] >emb|CAR71749.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae Br4923] 162 162 100% 2e-50
ZP_07966703.1 hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] >gb|EFV12044.1| hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] 162 162 88% 3e-50
YP_905336.1 acyl carrier protein [Mycobacterium ulcerans Agy99] >ref|YP_001851618.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] >gb|ABL03865.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium ulcerans Agy99] >gb|ACC41763.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] 161 161 100% 3e-50
ZP_08713925.1 acyl carrier protein [Mycobacterium colombiense CECT 3035] >gb|EGT87768.1| acyl carrier protein [Mycobacterium colombiense CECT 3035] 160 160 100% 6e-50
YP_003660002.1 phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] >gb|ADG99171.1| phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] 160 160 88% 8e-50

where in my code:

print "hit name is ", $hit->name, "\n"; # gives me the reference of the aligned sequence
print "Score: " . $hsp->score . "\n"; # gives me the score of the aligned sequence
print "E-val: " . $hsp->expect . "\n"; # gives me the E-value of the aligned sequence
print "percent identity: " . $hsp->percent_identity . "\n"; # gives me the query coverage of the aligned sequence

I want to use

#print "Description ",$hsp->desc, "\n";

to show the description, but I am not getting it. Can anybody help me out with this? I need to know urgently. Thanks for reading, and I hope I was successful in explaining my problem.

Below is a copy of the code I am trying to use:

use Bio::Tools::Run::RemoteBlast;
use Bio::SeqIO;    # used below to read the FASTA input
use strict;

my $v     = 1;
my $prog  = 'blastp';
my $db    = 'refseq_protein';
my $e_val = '1e-10'; #1e-10

my $result;
#my $code=q| my $answer = my $a / my $b;|;

my @params = (
    '-prog'   => $prog,
    '-data'   => $db,
    '-expect' => $e_val
);

my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
$v = 1;
my $str = Bio::SeqIO->new(-file => 'prot.txt', '-format' => 'fasta');
my $input;
while ($input = $str->next_seq()) {

    # Blast a sequence against a database:
    my $r = $factory->submit_blast($input);
    print STDERR "waiting..." if ( $v > 0 );

    my %hit_evalue;
    my @evalue;

    # Poll until every submitted request ID has been retrieved
    while ( my @rids = $factory->each_rid ) {
        foreach my $rid ( @rids ) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref($rc) ) {
                if ( $rc < 0 ) {
                    $factory->remove_rid($rid);
                }
                print STDERR "." if ( $v > 0 );
                sleep 5;
            } else {
                $factory->remove_rid($rid);
                #print $rid."\n\n";
                my $result = $rc->next_result;

                print "db is ", $result->database_name(), "\n";
                my $count = 0;
                while ( my $hit = $result->next_hit ) {
                    $count++;
                    #next unless ( $v > 0);
                    #print "hit name is ", $hit->name, "\n";
                    while ( my $hsp = $hit->next_hsp ) {
                        print "hit name is ", $hit->name, "\n";
                        #print "Query name is ",$hsp->desc, "\n"; exit;

                        print "Score: " . $hsp->score . "\n";
                        print "E-val: " . $hsp->expect . "\n";
                        print "percent identity: " . $hsp->percent_identity . "\n";
                    }
                }
            }
        }
    }
}

From pcantalupo at gmail.com Thu Aug 18 08:55:18 2011
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Thu, 18 Aug 2011 08:55:18 -0400
Subject: [Bioperl-l] query about Bio::Tools::Run::RemoteBlast
In-Reply-To: <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu> <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>
Message-ID: 

Punit,

I think you want '$hit->description', not '$hsp->desc'.

Paul

Paul Cantalupo
University of Pittsburgh

On Thu, Aug 18, 2011 at 8:14 AM, punit kumar wrote:
> [...]
> i want to use #print "Description ",$hsp->desc, "\n"; to show the description but i am not getting can any body help me out for this
> [...]

From David.Messina at sbc.su.se Fri Aug 19 05:07:35 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 19 Aug 2011 11:07:35 +0200
Subject: [Bioperl-l] Fwd: pls help..
In-Reply-To: 
References: 
Message-ID: 

Whoops, resending - the attachment was too big.

Ravi, please provide only a few example lines from your GFF file, or host the file elsewhere and post a link to it.

Dave

---------- Forwarded message ----------
From: Dave Messina
Date: Fri, Aug 19, 2011 at 10:53
Subject: pls help..
To: ravi.devani89 at gmail.com
Cc: bioperl-l

Ravi,

Your message belongs on the main BioPerl list, not the bioperl-dev list, so I'm reposting it there. To sign up for the main list, go to:

http://bioperl.org/mailman/listinfo/bioperl-l

Dave

---------- Forwarded message ----------
From: Ravi Devani
To: bioperl-dev at lists.open-bio.org
Date: Fri, 19 Aug 2011 13:54:22 +0530
Subject: Fwd: pls help..

I tried to create a GFF3 file from a .gbk file using the BioPerl genbank2gff3 script, but what I get is the same features repeating many times, and the file keeps growing in size until my hard disk gets full. I have tried to filter out all other features except "region", but it still repeats a single entry many times.
I have attached a part of the file generated. Please kindly help me.

From ravi.devani89 at gmail.com Fri Aug 19 01:16:00 2011
From: ravi.devani89 at gmail.com (Ravi Devani)
Date: Fri, 19 Aug 2011 10:46:00 +0530
Subject: [Bioperl-l] pls help..
In-Reply-To: 
References: 
Message-ID: 

---------- Forwarded message ----------
From: Ravi Devani
Date: Thu, Aug 18, 2011 at 12:40 PM
Subject: pls help..
To: scott at scottcain.net

I tried to create a GFF3 file from a .gbk file using bp_genbank2gff3.pl, but what I get is the same features repeating many times, and the file keeps growing in size until my hard disk gets full. I have tried to filter out all other features except "region", but it still repeats a single entry many times. I have attached a part of the file generated. Please kindly help me.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ref_chrUn.gff
Type: application/octet-stream
Size: 602112 bytes
Desc: not available
URL: 

From anjan.purkayastha at gmail.com Mon Aug 15 10:32:39 2011
From: anjan.purkayastha at gmail.com (ANJAN PURKAYASTHA)
Date: Mon, 15 Aug 2011 10:32:39 -0400
Subject: [Bioperl-l] Problem with Bio::DB::Taxonomy
Message-ID: 

Hello,

I wrote a short test script for the Bio::DB::Taxonomy module:

================================================
#!/usr/bin/perl -w
use strict;
use Bio::DB::Taxonomy;

my ($nodesfile, $namesfile)= ('nodes.dmp', 'names.dmp');

my $db= new Bio::DB::Taxonomy(-source => 'flatfile',
                              -nodesfile => $nodesfile,
                              -namesfile => $namesfile
                              );

my $bacteria= $db->get_Taxonomy_Node(-taxonid => '2');
print("$bacteria->id\t$bacteria->name\n");
================================================

On running this script I expect the following output: 2 Bacteria.

Instead I get a warning:
UNIVERSAL->import is deprecated and will be removed in a future perl at /usr/share/perl5/vendor_perl/Bio/Tree/TreeFunctionsI.pm line 94.
and the following output:
Bio::Taxon=HASH(0x158dbe0)->id Bio::Taxon=HASH(0x158dbe0)->name

The script seems to be working but there seems to be a problem with dereferencing a Bio::Taxon object.

Any leads on how to troubleshoot this will be much appreciated.
Thanks
Anjan

--
===================================
Anjan Purkayastha, PhD
Senior Computational Biologist
TessArae LLC
46090 Lake Center Plaza, Suite 304
Potomac Falls, VA 20165
Office- 703.444.7188 ext. 116
Mobile-703.740.6939
===================================

From scott at scottcain.net Fri Aug 19 09:45:47 2011
From: scott at scottcain.net (Scott Cain)
Date: Fri, 19 Aug 2011 09:45:47 -0400
Subject: [Bioperl-l] pls help..
In-Reply-To:
References:
Message-ID: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net>

Ravi,

The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using.

Scott

Sent from my iPhone

On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote:

> ---------- Forwarded message ----------
> From: Ravi Devani
> Date: Thu, Aug 18, 2011 at 12:40 PM
> Subject: pls help..
> To: scott at scottcain.net
>
>
> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but
> what i get is same features repeating many times.. and the file keeps
> growing in size until my harddisk gets full.. i have tried to filter all
> other features except "region" but still it repeats a single entry many
> times.. i have attached a part of the file generated.. pls kindly help me.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at illinois.edu Fri Aug 19 10:05:03 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 19 Aug 2011 09:05:03 -0500
Subject: [Bioperl-l] Problem with Bio::DB::Taxonomy
In-Reply-To:
References:
Message-ID: <7A733494-D831-43ED-9AE4-AB62AC5A2761@illinois.edu>

Anjan,

You are likely using an old version of BioPerl (this was fixed in the latest release on CPAN I believe). Bio::DB::Taxonomy uses Bio::Taxon, so the use of name() is incorrect; the method is node_name(). If name() is documented somewhere, that documentation is wrong, so let us know where it came from.

Also, the print statement at the end isn't interpolating correctly: inside a double-quoted string Perl interpolates only the variable $bacteria, not the method call, so the object is stringified and "->id" is printed literally. In general with objects I make this more explicit:

print $bacteria->id."\t".$bacteria->node_name."\n";

Correcting that, it works for me:

[cjfields@pyrimidine1 anjan]$ perl test.pl
2 Bacteria

chris

On Aug 15, 2011, at 9:32 AM, ANJAN PURKAYASTHA wrote:

> Hello,
> I wrote a short test script for the Bio::DB::Taxonomy module:
> ================================================
> #!/usr/bin/perl -w
> use strict;
> use Bio::DB::Taxonomy;
>
> my ($nodesfile, $namesfile)= ('nodes.dmp', 'names.dmp');
>
> my $db= new Bio::DB::Taxonomy(-source => 'flatfile',
>                               -nodesfile => $nodesfile,
>                               -namesfile => $namesfile
>                              );
>
> my $bacteria= $db->get_Taxonomy_Node(-taxonid => '2');
> print("$bacteria->id\t$bacteria->name\n");
> ================================================
>
> On running this script I expect the following output: 2 Bacteria.
>
> Instead I get a warning:
> UNIVERSAL->import is deprecated and will be removed in a future perl at
> /usr/share/perl5/vendor_perl/Bio/Tree/TreeFunctionsI.pm line 94.
> > and the following ouput: > Bio::Taxon=HASH(0x158dbe0)->id Bio::Taxon=HASH(0x158dbe0)->name > > The script seems to be working but there seems to be a problem with > dereferencing a Bio::Taxon object. > > Any leads on how to troubleshoot this will be much appreciated. > Thanks > Anjan > > > > -- > =================================== > Anjan Purkayastha, PhD > Senior Computational Biologist > TessArae LLC > 46090 Lake Center Plaza, Suite 304 > Potomac Falls, VA 20165** > Office- 703.444.7188 ext. 116 > Mobile-703.740.6939 > =================================== > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 19 10:26:06 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 19 Aug 2011 09:26:06 -0500 Subject: [Bioperl-l] pls help.. In-Reply-To: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> Message-ID: <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Scott, http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview (it's in the GFF file) It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note). On Aug 19, 2011, at 8:45 AM, Scott Cain wrote: > Ravi, > > The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using. > > Scott > > > Sent from my iPhone > > On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: > >> ---------- Forwarded message ---------- >> From: Ravi Devani >> Date: Thu, Aug 18, 2011 at 12:40 PM >> Subject: pls help.. >> To: scott at scottcain.net >> >> >> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >> what i get is same features repeating many times.. 
and the file keeps >> growing in size ntil my harddisk gets full.. i have tried to filter all >> other features except "region" but still it repeats a single entry many >> times.. i have attached a part of the file generated.. pls kindly help me. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Fri Aug 19 10:38:16 2011 From: scott at scottcain.net (Scott Cain) Date: Fri, 19 Aug 2011 10:38:16 -0400 Subject: [Bioperl-l] pls help.. In-Reply-To: <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Message-ID: I was wondering if perhaps the genbank file had been manipulated in some way. Scott On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields wrote: > Scott, > > http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview > > (it's in the GFF file) > > It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note). > > On Aug 19, 2011, at 8:45 AM, Scott Cain wrote: > >> Ravi, >> >> The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? ?Also, please indicate what version of bioperl you're using. >> >> Scott >> >> >> Sent from my iPhone >> >> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: >> >>> ---------- Forwarded message ---------- >>> From: Ravi Devani >>> Date: Thu, Aug 18, 2011 at 12:40 PM >>> Subject: pls help.. >>> To: scott at scottcain.net >>> >>> >>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >>> what i get is same features repeating many times.. 
and the file keeps
>>> growing in size until my harddisk gets full.. i have tried to filter all
>>> other features except "region" but still it repeats a single entry many
>>> times.. i have attached a part of the file generated.. pls kindly help me.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

From cjfields at illinois.edu Fri Aug 19 15:19:40 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 19 Aug 2011 14:19:40 -0500
Subject: [Bioperl-l] pls help..
In-Reply-To:
References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>
Message-ID:

Yeah, the output is rather odd. Maybe it's using the contig file version?

chris

On Aug 19, 2011, at 9:38 AM, Scott Cain wrote:

> I was wondering if perhaps the genbank file had been manipulated in some way.
>
> Scott
>
>
> On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields wrote:
>> Scott,
>>
>> http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview
>>
>> (it's in the GFF file)
>>
>> It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note).
>>
>> On Aug 19, 2011, at 8:45 AM, Scott Cain wrote:
>>
>>> Ravi,
>>>
>>> The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using.
>>> >>> Scott >>> >>> >>> Sent from my iPhone >>> >>> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: >>> >>>> ---------- Forwarded message ---------- >>>> From: Ravi Devani >>>> Date: Thu, Aug 18, 2011 at 12:40 PM >>>> Subject: pls help.. >>>> To: scott at scottcain.net >>>> >>>> >>>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >>>> what i get is same features repeating many times.. and the file keeps >>>> growing in size ntil my harddisk gets full.. i have tried to filter all >>>> other features except "region" but still it repeats a single entry many >>>> times.. i have attached a part of the file generated.. pls kindly help me. >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > > > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. scott at scottcain dot net > GMOD Coordinator (http://gmod.org/) 216-392-3087 > Ontario Institute for Cancer Research From hlapp at drycafe.net Fri Aug 19 23:38:51 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 19 Aug 2011 22:38:51 -0500 Subject: [Bioperl-l] [BioSQL-l] How is is_circular recorded in BioSQL (by BioPerl)? In-Reply-To: <4E2D79D6.6020108@gmail.com> References: <4E2D5000.30305@gmail.com> <4E2D5314.5090107@gmail.com> <4E2D5BAC.8020001@gmail.com> <4E2D79D6.6020108@gmail.com> Message-ID: <59AF5708-AECD-4375-9EB8-6E79D4B21C26@drycafe.net> I realize I'm chiming in here late, but the below sums it up quite well. In fact, biosequence.alphabet column was originally (pre-2002) called molecule, and the BioPerl Genbank writer defaults to alphabet() if molecule() is not defined. -hilmar Sent with a tap. 
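
A minimal sketch of the fallback described above, for illustration only. Toy::Seq and locus_molecule are hypothetical names, not part of BioPerl, and the actual logic in the GenBank writer may differ in detail; the sketch only shows why a sequence that comes back from the database without molecule() set would end up with the lower-case alphabet value on its LOCUS line.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object; NOT the real Bio::Seq::RichSeq.
package Toy::Seq;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}
sub molecule { return $_[0]->{molecule} }   # e.g. 'DNA'; may be undef
sub alphabet { return $_[0]->{alphabet} }   # e.g. 'dna', 'rna', 'protein'

package main;

# Choose the molecule type for a LOCUS line: prefer molecule(),
# fall back to alphabet() when molecule() is not defined.
sub locus_molecule {
    my ($seq) = @_;
    my $mol = $seq->molecule;
    return defined $mol ? $mol : $seq->alphabet;
}

my $from_file = Toy::Seq->new(molecule => 'DNA', alphabet => 'dna');
my $from_db   = Toy::Seq->new(alphabet => 'dna');   # molecule was never stored

print locus_molecule($from_file), "\n";   # prints DNA
print locus_molecule($from_db),   "\n";   # prints dna
```

Storing the molecule type as an annotation, as suggested in the quoted message below, would be one way to keep the first case after a round trip.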
On Jul 25, 2011, at 9:12 AM, Roy Chaudhuri wrote:

> As with the is_circular hack, you could store the molecule type by adding it as an annotation in the SequenceProcessor (it's stored as $seq->molecule by BioPerl).
>
> Actually, when round-tripping a GenBank file through BioSQL, the LOCUS line molecule type ends up in lower case, which makes me wonder if it is coming from alphabet in the biosequence table.

From hlapp at drycafe.net Fri Aug 19 21:02:12 2011
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 19 Aug 2011 20:02:12 -0500
Subject: [Bioperl-l] Error writing SequenceProcessor to associate GO terms in biosql database
In-Reply-To: <26C59A57-F54A-4237-8D97-4E7A77E55D59@sgul.ac.uk>
References: <26C59A57-F54A-4237-8D97-4E7A77E55D59@sgul.ac.uk>
Message-ID: <6BDB69DE-5856-4061-96FA-0CF2884EDD9E@drycafe.net>

Hi Adam

I'm not sure whether you've received a response to this. Apologies if not.

There is indeed a NOT NULL constraint on seqfeature_qualifier_value.value. The only other metadata association table in BioSQL that does this is location_qualifier_value. In the latter case there is arguably some sense to that (at least originally for locations the purpose of that table was pretty much to store the fuzzy location start/end properties), but for seqfeatures this looks like a bug to me. I'll post this to the BioSQL list and fix it if there are no objections, but feel free to drop the NOT NULL on that column yourself in the meantime.

The INSERT query gets constructed in the innards of Bioperl-db. There is no reason to mess with that for this problem though - just drop the NOT NULL constraint.

-hilmar

Sent with a tap.

On Jul 26, 2011, at 10:07 AM, Adam Witney wrote:

>
> Hi,
>
> I'm trying to write a SequenceProcessor for a genbank file to associate GO terms to the GO data preloaded in my biosql database.
The command looks like this: > > perl load_seqdatabase.pl --dbname=biosql --driver=Pg --host=myhost --port= 5432 --dbuser=user --dbpass=pass -format genbank -namespace testing -pipeline 'GOSequenceProcessor' --debug S_sonnei.EB1_s_sonnei.dat > > The SequenceProcessor process_seq looks like this: > > sub process_seq{ > my ($self,$seq) = @_; > > my @features = $seq->get_SeqFeatures(); > foreach my $feat ( @features ) { > if ( $feat->has_tag('db_xref') ) { > my @db_xrefs = $feat->get_tag_values('db_xref'); > > foreach my $db_xref (@db_xrefs) { > if ( $db_xref =~ m/^GO:/ ) { > my $term = Bio::Annotation::OntologyTerm->new(-identifier => $db_xref, > -ontology => 'Gene Ontology'); > $feat->annotation->add_Annotation($term); > } > } > } > } > > return ($seq); > } > > But this gives this error: > > preparing INSERT statement: INSERT INTO seqfeature_qualifier_value (seqfeature_id, term_id, rank) VALUES (?, ?, ?) > TermAdaptor::add_assoc: binding column 1 to "935181" (FK to Bio::SeqFeature::Generic) > TermAdaptor::add_assoc: binding column 2 to "50253" (FK to Bio::Annotation::OntologyTerm) > TermAdaptor::add_assoc: binding column 3 to "1" (rank) > > --------------------- WARNING --------------------- > MSG: TermAdaptor::add_assoc: unexpected failure of statement execution: ERROR: null value in column "value" violates not-null constraint > name: INSERT ASSOC [1] Bio::SeqFeature::Generic;Bio::Annotation::OntologyTerm > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::add_association /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:458 > STACK Bio::DB::BioSQL::AnnotationCollectionAdaptor::add_association /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:468 > STACK Bio::DB::BioSQL::SeqFeatureAdaptor::store_children /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/SeqFeatureAdaptor.pm:304 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264 > STACK Bio::DB::Persistent::PersistentObject::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/Persistent/PersistentObject.pm:284 > STACK Bio::DB::BioSQL::SeqAdaptor::store_children /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/SeqAdaptor.pm:257 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264 > STACK Bio::DB::Persistent::PersistentObject::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/Persistent/PersistentObject.pm:284STACK (eval) /var/users/adam/BioPerl/bioperl-db/scripts/biosql/load_seqdatabase.pl:630 > STACK toplevel /var/users/adam/BioPerl/bioperl-db/scripts/biosql/load_seqdatabase.pl:612 > > As you can see it generates an INSERT against seqfeature_qualifier_value without including a 'value' field, which is of course defined as NOT NULL. > > Firstly, is this the best way to achieve this? And secondly, where is the INSERT statement put together, I can't seem to find it in the object hierarchy > > Thanks > > adam > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From ulrik.stervbo at gmail.com Sun Aug 21 13:33:44 2011 From: ulrik.stervbo at gmail.com (Ulrik Stervbo) Date: Sun, 21 Aug 2011 19:33:44 +0200 Subject: [Bioperl-l] Change of Expasy Protparam url Message-ID: it seems the there are some minor changes with the urls for the expasy-services. 
In the Protparam.pm, line 110 should be changed from

@args=('-url'=>'http://www.expasy.org/cgi-bin/protparam','-form'=>'sequence',@args);

to

@args=('-url'=>'http://web.expasy.org/cgi-bin/protparam/protparam','-form'=>'sequence',@args);

At least it seems to be working here, after adding the change to my local Protparam.pm

Cheers,
Ulrik

From cjfields at illinois.edu Sun Aug 21 13:56:10 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 21 Aug 2011 12:56:10 -0500
Subject: [Bioperl-l] Change of Expasy Protparam url
In-Reply-To:
References:
Message-ID: <9178B7E4-6EF2-4BC7-9B1C-9E5B282B5012@illinois.edu>

Thanks for pointing that out. I've updated that on github. The critical thing is to get some tests working, so a failure of the web service doesn't happen again without some exceptions being raised (so we can track this).

chris

On Aug 21, 2011, at 12:33 PM, Ulrik Stervbo wrote:

> it seems there are some minor changes with the urls for the expasy-services.
>
> In the Protparam.pm, line 110 should be changed from
> @args=('-url'=>'http://www.expasy.org/cgi-bin/protparam','-form'=>'sequence',@args);
>
> to
>
> @args=('-url'=>'http://web.expasy.org/cgi-bin/protparam/protparam','-form'=>'sequence',@args);
>
> At least it seems to be working here, after adding the change to my
> local Protparam.pm
>
> Cheers,
> Ulrik
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From scott at scottcain.net Mon Aug 22 11:51:55 2011
From: scott at scottcain.net (Scott Cain)
Date: Mon, 22 Aug 2011 11:51:55 -0400
Subject: [Bioperl-l] pls help..
In-Reply-To:
References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>
Message-ID:

Hello Ravi,

Please keep the BioPerl mailing list cc'ed.
I downloaded your 1.7GB multi-genbank file and started processing it with bp_genbank2gff3.pl and I killed it when the GFF file got to 10GB, however, it was working as expected. I suggest you upgrade to the most recent release of BioPerl and try again. Additionally, it might make sense to break that big multi-genbank file into smaller files. Scott On Sun, Aug 21, 2011 at 11:33 AM, Ravi Devani wrote: > scott i hv given the link to the gbk file, please kindly help me > > On 8/19/11, Scott Cain wrote: >> Ravi, >> >> I also meant to ask what version of BioPerl you are using. ?When I run >> this command >> >> ? bp_genbank2gff3.pl NW_002121371.gbk >> >> I get a rather dull GFF3 file with 4 lines of GFF (one region and >> three gaps) and a fasta section. >> >> Scott >> >> >> On Fri, Aug 19, 2011 at 12:33 PM, Ravi Devani >> wrote: >>> No the genbank file has not been manipulated >>> >>> On 8/19/11, Scott Cain wrote: >>>> I was wondering if perhaps the genbank file had been manipulated in some >>>> way. >>>> >>>> Scott >>>> >>>> >>>> On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields >>>> wrote: >>>>> Scott, >>>>> >>>>> http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview >>>>> >>>>> (it's in the GFF file) >>>>> >>>>> It definitely is getting stuck in a loop for the genomic region, but >>>>> using >>>>> the file for GFF3 doesn't make sense (very few features of note). >>>>> >>>>> On Aug 19, 2011, at 8:45 AM, Scott Cain wrote: >>>>> >>>>>> Ravi, >>>>>> >>>>>> The gff file is fairly useless from a debugging perspective. Can you >>>>>> please attach the genbank file you're using? ?Also, please indicate >>>>>> what >>>>>> version of bioperl you're using. >>>>>> >>>>>> Scott >>>>>> >>>>>> >>>>>> Sent from my iPhone >>>>>> >>>>>> On Aug 19, 2011, at 1:16 AM, Ravi Devani >>>>>> wrote: >>>>>> >>>>>>> ---------- Forwarded message ---------- >>>>>>> From: Ravi Devani >>>>>>> Date: Thu, Aug 18, 2011 at 12:40 PM >>>>>>> Subject: pls help.. 
>>>>>>> To: scott at scottcain.net >>>>>>> >>>>>>> >>>>>>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl >>>>>>> but >>>>>>> what i get is same features repeating many times.. and the file keeps >>>>>>> growing in size ntil my harddisk gets full.. i have tried to filter >>>>>>> all >>>>>>> other features except "region" but still it repeats a single entry >>>>>>> many >>>>>>> times.. ?i have attached a part of the file generated.. pls kindly >>>>>>> help >>>>>>> me. >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Bioperl-l mailing list >>>>>>> Bioperl-l at lists.open-bio.org >>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>> _______________________________________________ >>>>>> Bioperl-l mailing list >>>>>> Bioperl-l at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>>> >>>> >>>> >>>> >>>> -- >>>> ------------------------------------------------------------------------ >>>> Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? scott at scottcain >>>> dot >>>> net >>>> GMOD Coordinator (http://gmod.org/)? ? ? ? ? ? ? ? ? ?? 216-392-3087 >>>> Ontario Institute for Cancer Research >>>> >>> >> >> >> >> -- >> ------------------------------------------------------------------------ >> Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? scott at scottcain dot >> net >> GMOD Coordinator (http://gmod.org/)? ? ? ? ? ? ? ? ? ?? 216-392-3087 >> Ontario Institute for Cancer Research >> > -- ------------------------------------------------------------------------ Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? scott at scottcain dot net GMOD Coordinator (http://gmod.org/)? ? ? ? ? ? ? ? ? ?? 
216-392-3087
Ontario Institute for Cancer Research

From allenday at ionflux.com Mon Aug 22 14:40:33 2011
From: allenday at ionflux.com (Allen Day, PhD)
Date: Mon, 22 Aug 2011 18:40:33 +0000
Subject: [Bioperl-l] Beijing and Los Angeles Human NGS Biostatistics/Informatics jobs
Message-ID:

Hi all,

Ion Flux is a startup that I just created to apply NGS technology to the clinical diagnostics field. We like to think of ourselves as an enterprise class "23andme". This is an early-stage startup -- you will have a chance to influence the company and to be rewarded accordingly. I am Allen, the founder.

We have a couple of open positions - for smart, passionate, scientist / engineering types. Others need not apply. Please check out these job descriptions, if this sparks your interest:

http://ionflux.com/blog/careers/bioinformatician-data-modeling-and-processing/
http://ionflux.com/blog/careers/bioinformatician-data-analysis-and-algorithms/

Our offices are in Los Angeles (UCLA adjacent) and Beijing.

I'm happy to post future openings to other lists if this isn't the right venue for an occasional job announcement.

-Allen

From acpatel at gmail.com Mon Aug 22 15:25:50 2011
From: acpatel at gmail.com (Anand Patel)
Date: Mon, 22 Aug 2011 14:25:50 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
Message-ID:

I'm trying to get Primer3Redux to work, and am noticing some strange things. While I found and changed my parameters to the new primer3 2.2.3 parameters, I still can't find add_targets.

Assigning the parameters using set_parameters works for primer3redux; add_targets is "leftover" from primer3.

So is this a doc/POD issue?

Thanks,
Anand

Anand C.
Patel, MD MS
Washington University School of Medicine
acpatel at gmail.com

From cjfields1 at gmail.com Mon Aug 22 15:42:28 2011
From: cjfields1 at gmail.com (Christopher Fields)
Date: Mon, 22 Aug 2011 14:42:28 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To:
References:
Message-ID:

On Aug 22, 2011, at 2:25 PM, Anand Patel wrote:

> I'm trying to get Primer3Redux to work, and am noticing some strange
> things. While I found and changed my parameters to the new primer3
> 2.2.3 parameters, I still can't find add_targets.
>
> Assigning the parameters using set_parameters works for primer3redux,
> add_targets is "leftover" from primer3.
>
> So is this a doc/POD issue?

I'm confused. You are trying to use add_targets with the latest primer3, but it isn't there? Or is the Primer3Redux wrapper missing this parameter?

chris

> Thanks,
> Anand
>
> Anand C. Patel, MD MS
> Washington University School of Medicine
> acpatel at gmail.com
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From acpatel at gmail.com Mon Aug 22 15:52:12 2011
From: acpatel at gmail.com (Anand Patel)
Date: Mon, 22 Aug 2011 14:52:12 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To:
References:
Message-ID:

my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => "temp.out",
                                                 -path => "/usr/bin/primer3_core");

If I use this:
$primer3->add_targets(
    'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
    'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
    'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
    'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
    'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
    'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
    'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
    'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
    'SEQUENCE_TARGET'=>$TARGET,
    'PRIMER_PRODUCT_SIZE_RANGE'=>$PRIMER_PRODUCT_SIZE_RANGE);

I get:
Can't locate object
method "add_targets" via package "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1.

On the other hand, if I change that line to:
$primer3->set_parameters(
    'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
    'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
    'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
    'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
    'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
    'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
    'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
    'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
    'SEQUENCE_TARGET'=>$TARGET,
    'PRIMER_PRODUCT_SIZE_RANGE'=>$PRIMER_PRODUCT_SIZE_RANGE);

It works. When I looked at the source code for Primer3Redux, I couldn't find add_targets, but set_parameters looked like it might work, so I used that instead, and it worked.

But I see over in the github that there are other issues with the documentation (how primer3redux's result object is now 3 deep rather than 2 deep). Not sure if this is in that category or not.

Thanks,
Anand

On Mon, Aug 22, 2011 at 2:42 PM, Christopher Fields wrote:
> On Aug 22, 2011, at 2:25 PM, Anand Patel wrote:
>
>> I'm trying to get Primer3Redux to work, and am noticing some strange
>> things. While I found and changed my parameters to the new primer3
>> 2.2.3 parameters, I still can't find add_targets.
>>
>> Assigning the parameters using set_parameters works for primer3redux,
>> add_targets is "leftover" from primer3.
>>
>> So is this a doc/POD issue?
>
> I'm confused. You are trying to use add_targets with the latest primer3, but it isn't there? Or is the Primer3Redux wrapper missing this parameter?
>
> chris
>
>> Thanks,
>> Anand
>>
>> Anand C.
Patel, MD MS >> Washington University School of Medicine >> acpatel at gmail.com >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Mon Aug 22 16:10:25 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 22 Aug 2011 15:10:25 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: References: Message-ID: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote: > my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => > "temp.out", -path => "/usr/bin/primer3_core"); > > If I use this: > $primer3->add_targets( > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > =>$PRIMER_PRODUCT_SIZE_RANGE); > > I get: > Can't locate object method "add_targets" via package > "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. > > On the other hand, if I change that line to: > $primer3->set_parameters( > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > =>$PRIMER_PRODUCT_SIZE_RANGE); > > It works. 
When I looked at the source code for Primer3Redux, I
> couldn't find add_targets, but set_parameters looked like it might
> work, so I used that instead, and it worked.
>
> But I see over in the github that there are other issues with the
> documentation (how primer3redux's result object is now 3 deep rather
> than 2 deep). Not sure if this is in that category or not.

That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github; if you can file this as well it would help.

I have to go back and look at the history of add_targets() relative to primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience...

chris

> Thanks,
> Anand
...

From miquel.amat at me.com Tue Aug 23 16:11:15 2011
From: miquel.amat at me.com (Miguel A. Amat)
Date: Tue, 23 Aug 2011 16:11:15 -0400
Subject: [Bioperl-l] Installation on OS X Lion
Message-ID:

I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql.

I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively via the "perl Build.PL" command.

Can you provide some help, or suggest an alternative way of installing BioPerl?

From cjfields at illinois.edu Tue Aug 23 20:14:49 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 23 Aug 2011 19:14:49 -0500
Subject: [Bioperl-l] Installation on OS X Lion
In-Reply-To:
References:
Message-ID:

Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req.
DBD::mysql is only needed if you intend on using modules requiring that functionality. chris On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: > I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. > > I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. > > Can you provide some help, or suggest an alternative way of installing BioPerl? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From miquel.amat at me.com Tue Aug 23 23:25:31 2011 From: miquel.amat at me.com (Miguel A Amat) Date: Tue, 23 Aug 2011 23:25:31 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: References: Message-ID: Thanks for the feedback, Chris. Now I just need to get GD to install ... On Aug 23, 2011, at 8:14 PM, Chris Fields wrote: > Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend on using modules requiring that functionality. > > chris > > On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: > >> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >> >> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >> >> Can you provide some help, or suggest an alternative way of installing BioPerl? 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From scott at scottcain.net Wed Aug 24 10:31:44 2011 From: scott at scottcain.net (Scott Cain) Date: Wed, 24 Aug 2011 10:31:44 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: References: Message-ID: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> Hi Miguel, Did you try the installer for snow leopard on sourceforge: http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/ I don't know if it will work on lion but I don't have a copy of lion yet to try it out on. Scott Sent from my iPhone On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote: > Thanks for the feedback, Chris. Now I just need to get GD to > install ... > > On Aug 23, 2011, at 8:14 PM, Chris Fields > wrote: > >> Try installing the latest version from CPAN; this bypasses the >> Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend >> on using modules requiring that functionality. >> >> chris >> >> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >> >>> I am trying to install bioperl on mac os x 10.7 but ran into >>> problems with the dependency packages Bio::ASN1::EntrezGene and >>> DBD::mysql. >>> >>> I am running the latest version of CPAN and perl -v 5.12.3 and the >>> BioPerl-1.6.1 package. The installation was being conducted >>> interactively through via the "perl Build.PL" command. >>> >>> Can you provide some help, or suggest an alternative way of >>> installing BioPerl? 
>>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From sheena.scroggins at gmail.com Wed Aug 24 12:21:07 2011 From: sheena.scroggins at gmail.com (Sheena Scroggins) Date: Wed, 24 Aug 2011 09:21:07 -0700 Subject: [Bioperl-l] End of GSoC Message-ID: I just wanted to give a GIANT thanks to my mentors on the BioPerl project, Rob Buels and Chris Fields. They helped me tremendously and we made great progress on the reorganization. All of the modules we extracted can be found on github at https://github.com/bioperl We used a Dist Zilla plugin bundle, which can also be found there. The steps used in the process will be outlined on the BioPerl wiki in the upcoming weeks. The reorganization is off to a great start and by outlining the workflow I'm hoping others will be able to contribute more easily. The progress updates were posted at techomics.com during the project, although they were sporadic. The original outline of the project can be found there as well. Thanks again to all the mentors of GSoC, this program wouldn't work without you! Sheena From miquel.amat at me.com Wed Aug 24 13:48:06 2011 From: miquel.amat at me.com (Miguel A. Amat) Date: Wed, 24 Aug 2011 13:48:06 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> References: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> Message-ID: <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com> Thanks for all the help; I finally got it to work. Here are the steps I took: upgraded CPAN and used latest version of BioPerl installed dependencies in interactive mode, but GD failed. 
Quit the installation and tried "install GD-SVG"; this one seems to have less functionality than GD, but it worked. Installed Bio::Perl. Then, installed Bio::ASN1::EntrezGene. Best. On Aug 24, 2011, at 10:31 AM, Scott Cain wrote: > Hi Miguel, > > Did you try the installer for snow leopard on sourceforge: > > http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/ > > I don't know if it will work on lion but I don't have a copy of lion yet to try it out on. > > Scott > > > Sent from my iPhone > > On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote: > >> Thanks for the feedback, Chris. Now I just need to get GD to install ... >> >> On Aug 23, 2011, at 8:14 PM, Chris Fields wrote: >> >>> Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend on using modules requiring that functionality. >>> >>> chris >>> >>> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >>> >>>> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >>>> >>>> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >>>> >>>> Can you provide some help, or suggest an alternative way of installing BioPerl?
>>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Aug 24 13:51:19 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Aug 2011 12:51:19 -0500 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com> References: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com> Message-ID: <200F67E8-7B4E-40AD-9C0D-37160B970F22@illinois.edu> Interesting, since GD::SVG requires GD. Anyway, glad to know it's working for you! chris On Aug 24, 2011, at 12:48 PM, Miguel A. Amat wrote: > Thanks for all the help; I finally got it to work. Here are the steps I took: > > > ? upgraded CPAN and used latest version of BioPerl > ? installed dependencies in interactive mode, but GD failed. > ? Quit the installation and tried ?install GD-SVG?; this one seems to have less functionality than GD, but it worked. > ? Installed Bio::Perl. > ? Then, installed Bio::ASN1::EntrezGene > > > > > > > Best. > > > On Aug 24, 2011, at 10:31 AM, Scott Cain wrote: > >> Hi Miguel, >> >> Did you try the installer for snow leopard on sourceforge: >> >> http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/ >> >> I don't know if it will work on lion but I don't have a copy of lion yet to try it out on. >> >> Scott >> >> >> Sent from my iPhone >> >> On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote: >> >>> Thanks for the feedback, Chris. Now I just need to get GD to install ... >>> >>> On Aug 23, 2011, at 8:14 PM, Chris Fields wrote: >>> >>>> Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. 
DBD::mysql is only needed if you intend on using modules requiring that functionality. >>>> >>>> chris >>>> >>>> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >>>> >>>>> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >>>>> >>>>> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >>>>> >>>>> Can you provide some help, or suggest an alternative way of installing BioPerl? >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From abualiga2 at gmail.com Wed Aug 24 14:09:10 2011 From: abualiga2 at gmail.com (galeb abu-ali) Date: Wed, 24 Aug 2011 14:09:10 -0400 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: Hi, I'm trying to run a program that generates a circular genome homology atlas "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think the problem is with the module that appends schemas to the proxy, and I don't know how to do that manually. I've emailed the author couple times and have not heard back. Pasted below is the error message. At your convenience, I'd greatly appreciate your help. thanks galeb p/s - also, is there another program that can generate concetric circular plots of BLAST scores for multiple bacterial genomes with a per nucleotide resolution? 
thanks [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position Preference" -title "B. pseudomallei K96243" > sgeneric.ps # title set to 'B. pseudomallei K96243' # output format is ps # modus is 'circle' # loading reference genome ... # loading proteins ... # parsing blast lane configuration (blast.cfg) ... # .. parsing blast lane (B. ubonensis Bu) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done # .. parsing blast lane (B. pseudomallei DM98) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done # parsing custom lane configuration (custom.cfg) ... # .. parsing custom data entry SIDD at -0.035 ... # .. .. parsing color 000010_101010 # .. .. .. color from: r:00, g:00, b:10 # .. .. .. color to: r:10, g:10, b:10 # .. .. byrange: 9 .. 10 # .. .. boxfilter 5000 ... # .. parsing data source 'gunzip -c BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz | cut -f4 |' ... # .. .. parsing data source ... 3173005 done # reading external files and build hash of sequences ... 
*panic: schemas() removed in v2.00, not needed anymore* at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at xml-compile.pl line 48 main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at BLASTatlas line 177 From roy.chaudhuri at gmail.com Wed Aug 24 14:21:12 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 24 Aug 2011 19:21:12 +0100 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: <4E554118.90108@gmail.com> Hi Galeb, This is the wrong mailing list for your question - it's intended for discussion of the Bioperl toolkit, not general bioinformatics questions. Next time, try a general bioinformatics mailing list such as BBB: http://www.bioinformatics.org/lists/bbb Having said all that, maybe you could try BRIG: http://sourceforge.net/projects/brig/ http://www.biomedcentral.com/1471-2164/12/402 Cheers, Roy. On 24/08/2011 19:09, galeb abu-ali wrote: > Hi, > > I'm trying to run a program that generates a circular genome homology atlas > "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think > the problem is with the module that appends schemas to the proxy, and I > don't know how to do that manually. I've emailed the author couple times and > have not heard back. Pasted below is the error message. At your convenience, > I'd greatly appreciate your help. > > thanks > > galeb > > p/s - also, is there another program that can generate concetric circular > plots of BLAST scores for multiple bacterial genomes with a per nucleotide > resolution? 
thanks > > [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa > -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg > -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position > Preference" -title "B. pseudomallei K96243"> sgeneric.ps > # title set to 'B. pseudomallei K96243' > # output format is ps > # modus is 'circle' > # loading reference genome ... > # loading proteins ... > # parsing blast lane configuration (blast.cfg) ... > # .. parsing blast lane (B. ubonensis Bu) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done > # .. parsing blast lane (B. pseudomallei DM98) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done > # parsing custom lane configuration (custom.cfg) ... > # .. parsing custom data entry SIDD at -0.035 ... > # .. .. parsing color 000010_101010 > # .. .. .. color from: r:00, g:00, b:10 > # .. .. .. color to: r:10, g:10, b:10 > # .. .. byrange: 9 .. 10 > # .. .. boxfilter 5000 ... > # .. parsing data source 'gunzip -c > BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz > | cut -f4 |' ... > # .. .. parsing data source ... 3173005 done > # reading external files and build hash of sequences ... 
> *panic: schemas() removed in v2.00, not needed anymore* > at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 > XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at > xml-compile.pl line 48 > main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " > http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " > http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at > BLASTatlas line 177 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Aug 24 14:22:26 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Aug 2011 13:22:26 -0500 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: <61E5BF3C-653F-40D3-8764-0DA61859BC8B@illinois.edu> Sorry, but this doesn't have anything to do with BioPerl. Not sure you'll get an answer here. chris On Aug 24, 2011, at 1:09 PM, galeb abu-ali wrote: > Hi, > > I'm trying to run a program that generates a circular genome homology atlas > "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think > the problem is with the module that appends schemas to the proxy, and I > don't know how to do that manually. I've emailed the author couple times and > have not heard back. Pasted below is the error message. At your convenience, > I'd greatly appreciate your help. > > thanks > > galeb > > p/s - also, is there another program that can generate concetric circular > plots of BLAST scores for multiple bacterial genomes with a per nucleotide > resolution? thanks > > [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa > -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg > -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position > Preference" -title "B. pseudomallei K96243" > sgeneric.ps > # title set to 'B. 
pseudomallei K96243' > # output format is ps > # modus is 'circle' > # loading reference genome ... > # loading proteins ... > # parsing blast lane configuration (blast.cfg) ... > # .. parsing blast lane (B. ubonensis Bu) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done > # .. parsing blast lane (B. pseudomallei DM98) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done > # parsing custom lane configuration (custom.cfg) ... > # .. parsing custom data entry SIDD at -0.035 ... > # .. .. parsing color 000010_101010 > # .. .. .. color from: r:00, g:00, b:10 > # .. .. .. color to: r:10, g:10, b:10 > # .. .. byrange: 9 .. 10 > # .. .. boxfilter 5000 ... > # .. parsing data source 'gunzip -c > BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz > | cut -f4 |' ... > # .. .. parsing data source ... 3173005 done > # reading external files and build hash of sequences ... 
> *panic: schemas() removed in v2.00, not needed anymore* > at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 > XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at > xml-compile.pl line 48 > main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " > http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " > http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at > BLASTatlas line 177 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From abualiga2 at gmail.com Wed Aug 24 14:39:33 2011 From: abualiga2 at gmail.com (abualiga2 at gmail.com) Date: Wed, 24 Aug 2011 18:39:33 +0000 Subject: [Bioperl-l] append schema to proxy In-Reply-To: <4E554118.90108@gmail.com> Message-ID: <00504502ec3723598f04ab44a23f@google.com> Roy, thanks! I'll try that. galeb On Aug 24, 2011 2:21pm, Roy Chaudhuri wrote: > Hi Galeb, > This is the wrong mailing list for your question - it's intended for > discussion of the Bioperl toolkit, not general bioinformatics questions. > Next time, try a general bioinformatics mailing list such as BBB: > http://www.bioinformatics.org/lists/bbb > Having said all that, maybe you could try BRIG: > http://sourceforge.net/projects/brig/ > http://www.biomedcentral.com/1471-2164/12/402 > Cheers, > Roy. From slucky at ibab.ac.in Mon Aug 22 02:01:16 2011 From: slucky at ibab.ac.in (Lucky Singh) Date: Mon, 22 Aug 2011 11:31:16 +0530 (IST) Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast Message-ID: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Dear sir/Ma'am, I am student of Institute of Bioinformatics and Applied Biotechnology, Bangalore, India. While doing my project work I needed remoteblast.pm. So I used default example program which is available with this package. 
Now I wanted to host it from web server, but This program is not working from it may be it is not able to create or write on file from web server but in command line it is working fine. I don't know the possible reason, please help me to figure it out. -> I am using same example program with basic cgi modification for taking input from web browser. -> Ubuntu 10.04 64 bit OS -> apache2 server -> I have given all permissions 777 recursively to cgi-bin folder -- Regards, Lucky Singh Institute of Bioinformatics and Applied Biotechnology, ------------------------------------------------------ Biotech Park Electronics City Phase I Bangalore 560 100 India. Tel: 080-28528900, 080-28528901, 080-28528902 Fax: 080-28528904 From abualiga2 at gmail.com Wed Aug 24 13:26:10 2011 From: abualiga2 at gmail.com (galeb abu-ali) Date: Wed, 24 Aug 2011 13:26:10 -0400 Subject: [Bioperl-l] append schema to proxy Message-ID: Hi, I'm trying to run a program that generates a circular genome homology atlas "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think the problem is with the module that appends schemas to the proxy, and I don't know how to do that manually. I've emailed the author couple times and have not heard back. Pasted below is the error message. At your convenience, I'd greatly appreciate your help. thanks galeb p/s - also, is there another program that can generate concetric circular plots of BLAST scores for multiple bacterial genomes with a per nucleotide resolution? thanks [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position Preference" -title "B. pseudomallei K96243" > sgeneric.ps # title set to 'B. pseudomallei K96243' # output format is ps # modus is 'circle' # loading reference genome ... # loading proteins ... # parsing blast lane configuration (blast.cfg) ... # .. 
parsing blast lane (B. ubonensis Bu) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done # .. parsing blast lane (B. pseudomallei DM98) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done # parsing custom lane configuration (custom.cfg) ... # .. parsing custom data entry SIDD at -0.035 ... # .. .. parsing color 000010_101010 # .. .. .. color from: r:00, g:00, b:10 # .. .. .. color to: r:10, g:10, b:10 # .. .. byrange: 9 .. 10 # .. .. boxfilter 5000 ... # .. parsing data source 'gunzip -c BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz | cut -f4 |' ... # .. .. parsing data source ... 3173005 done # reading external files and build hash of sequences ... *panic: schemas() removed in v2.00, not needed anymore* at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at xml-compile.pl line 48 main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at BLASTatlas line 177 From jj.emerson at gmail.com Wed Aug 24 21:53:38 2011 From: jj.emerson at gmail.com (J.J. Emerson) Date: Wed, 24 Aug 2011 18:53:38 -0700 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe Message-ID: Hello All, I have experienced some behavior in SeqIO that doesn't seem to be what I would expect. Basically, for a certain script, if I try to pass something like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the following two conditions are met simultaneously: 1. 
STDIN is coming from a pipe;
2. SeqIO is trying to guess the format.

If STDIN is coming from redirection instead of a pipe, or if the format is specified manually (i.e. BioPerl doesn't have to guess), the error doesn't seem to occur.

This issue has been reported previously:

http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html
https://redmine.open-bio.org/issues/3122

This issue is ultimately one of using seek() on a pipe, which cannot work because pipes are not seekable (see below).

To be clear, there are kludgy ways around this that allow BioPerl to take input from a pipe AND guess the format. My naive and inefficient kludge was to test for reading from STDIN and for the absence of a format. If both of these conditions are met, then I slurp STDIN into a variable, open a filehandle on that variable, and pass it to SeqIO, which can guess the format if the fh isn't opened on a pipe. SeqIO then successfully guesses the format and does the SeqIO thing, at the expense of having the program pass over the data at least twice. And if the input file is huge, it could potentially consume all the memory. A better way to address the problem would be to process the input one line at a time, but this seems to require more extensive changes.

I'm reposting this because I think that the inability to guess the sequence format from data originating from a pipe is an important limitation for a fundamental part of BioPerl. When designing scripts to be used in pipelines, the inability to guess formats for piped data limits BioPerl's pipelineability substantially. Even though previous reports of this have been made and a bug opened and closed, I was wondering if anyone thought this was worth fixing so as to make SeqIO (and probably AlignIO as well?) more flexible? Does anyone think this should be refiled as a bug?

Cheers,
J.J.

PS Below are snippets of code and/or errors related to reproducing the failure to guess unspecified formats.
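For readers following the thread, here is a minimal sketch of the slurp-and-reopen kludge described above, in core Perl only. The helper name seekable_copy and the child-process pipe used for the demonstration are illustrative assumptions; neither is part of BioPerl or of the attached script. The idea is only that a filehandle opened on an in-memory scalar supports seek(), so a format guesser can rewind it. As already noted, the whole input ends up in memory.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): copy everything from a
# possibly non-seekable handle, such as STDIN fed by a pipe, into memory
# and return a read handle opened on that in-memory copy.
sub seekable_copy {
    my ($in) = @_;
    my $data = do { local $/; <$in> };   # slurp the whole stream
    $data = '' unless defined $data;     # empty input gives an empty buffer
    open(my $mem, '<', \$data)
        or die "Cannot open in-memory filehandle: $!";
    return $mem;
}

# Demonstration with a real pipe: a child perl prints one FASTA record.
open(my $pipe, '-|', $^X, '-e', 'print qq{>seq1\nACGT\n}')
    or die "Cannot start child process: $!";
my $fh = seekable_copy($pipe);
close($pipe);

# Peek at the first line, as a format guesser would, then rewind.
my $first_line = <$fh>;
seek($fh, 0, 0) or die "seek failed on the in-memory copy: $!";
print "Rewound after peeking at: $first_line";
```

In a script like the one discussed here, the returned handle would presumably be passed as -fh to Bio::SeqIO->new() with no -format argument, and the copy would only be made when -p STDIN reports a pipe and no format was given on the command line.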
I'll see how Mailman treats my attachments and post the code as a reply if they don't work. The bioperl_fhtest.pl attachment is the script that reproduces the error. w.fa is a FASTA file containing some sequence.

Here are the command lines to generate the behavior I observe (w.fa is a file containing some FASTA sequences; in my case it was the w gene from different *Drosophila* species):

    ./bioperl_fhtest.pl fasta < w.fa         # Works (redirection, no guessing)
    ./bioperl_fhtest.pl < w.fa               # Works (redirection, guessing)

    cat w.fa | ./bioperl_fhtest.pl fasta     # Works (pipe, no guessing)
    cat w.fa | ./bioperl_fhtest.pl           # DOESN'T work (pipe, guessing)

Here's the error I get in the last case:

    ------------- EXCEPTION: Bio::Root::Exception -------------
    MSG: Failed resetting the filehandle; IO error occurred
    STACK: Error::throw
    STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472
    STACK: Bio::Tools::GuessSeqFormat::guess /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512
    STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381
    STACK: ./bioperl_fhtest.pl:8
    -----------------------------------------------------------

From what I gather, the error is triggered by a failure of seek() on a STDIN filehandle on lines 517-518 (text from the version of GuessSeqFormat.pm installed on my server):

    512     if (defined $self->{-file}) {
    513         # Close the file we opened.
    514         close($fh);
    515     } elsif (ref $fh eq 'GLOB') {
    516         # Try seeking to the start position.
    517         seek($fh, $start_pos, 0) || $self->throw("Failed resetting the ".
    518                                                  "filehandle; IO error occurred");;
    519     } elsif (defined $fh && $fh->can('setpos')) {
    520         # Seek to the start position.
    521         $fh->setpos($start_pos);
    522     }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bioperl_fhtest.pl
Type: text/x-perl-script
Size: 505 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: w.fa
Type: application/octet-stream
Size: 6335 bytes
Desc: not available
URL:

From frederic.sapet at gmail.com Thu Aug 25 09:24:08 2011
From: frederic.sapet at gmail.com (Frédéric Sapet)
Date: Thu, 25 Aug 2011 15:24:08 +0200
Subject: [Bioperl-l] fasta35 and fasta36 parsing support in BioPerl
Message-ID:

Hello

I have tried to parse a fasta35 report file using BioPerl, in order to produce a valid HTML file. It seems to work well, but there's a small issue with the homology string in the report. Please find a test script in the attached files.

After that, I tried to parse a fasta36 file, but this does not seem to be supported yet. Here is the error thrown:

Uncaught exception from user code:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unrecognized alignment line (3) '>--'
STACK: Error::throw
STACK: Bio::Root::Root::throw /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm:472
STACK: Bio::SearchIO::fasta::next_result /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm:1061
STACK: ./test.pl:36
-----------------------------------------------------------
 at /usr/lib/perl5/site_perl/5.10.0/Error.pm line 184
Error::throw('Bio::Root::Exception', 'Unrecognized alignment line (3) \'>--\'') called at /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm line 472
Bio::Root::Root::throw('Bio::SearchIO::fasta=HASH', 'Unrecognized alignment line (3) \'>--\'') called at /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm line 1061
Bio::SearchIO::fasta::next_result('Bio::SearchIO::fasta=HASH') called at ./test.pl line 36

Thank you

Fred
-------------- next part --------------
A non-text attachment was scrubbed...
Name: FastaBioPerl.tar.bz2
Type: application/x-bzip2
Size: 7692 bytes
Desc: not available
URL:

From miquel.amat at me.com Tue Aug 23 02:07:54 2011
From: miquel.amat at me.com (Miguel A. Amat)
Date: Tue, 23 Aug 2011 02:07:54 -0400
Subject: [Bioperl-l] Help
Message-ID: <44829080-5467-4103-AF5B-D09CBDA6F99F@me.com>

I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependencies Bio::ASN1::EntrezGene and DBD::mysql.

I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command.

Can you provide some help?

From bosborne11 at verizon.net Thu Aug 25 10:35:29 2011
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 25 Aug 2011 10:35:29 -0400
Subject: [Bioperl-l] SeqIO alters Genbank files
Message-ID:

bioperl-l,

I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file:

    /score=100.1

And adding a "note" tag, so the output file contains this:

    /score=100.1
    /note="score=100.1"

I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format.

On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this:

    /score=100.1
    /note="score=100.1"
    /note="score=100.1"
    /note="score=100.1"
    /note="score=100.1"

Should I comment out the code that's doing these edits or not?

Thanks again,

Brian O.
From cjfields at illinois.edu Thu Aug 25 12:21:15 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:21:15 -0500 Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast In-Reply-To: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> References: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Message-ID: It's hard to evaluate what the problem is w/o code, the BioPerl version, and so on. It's very possible you are using an out-of-date BioPerl. chris On Aug 22, 2011, at 1:01 AM, Lucky Singh wrote: > Dear sir/Ma'am, > > I am student of Institute of Bioinformatics and Applied Biotechnology, > Bangalore, India. While doing my project work I needed remoteblast.pm. So > I used default example program which is available with this package. Now I > wanted to host it from web server, but This program is not working from it > may be it is not able to create or write on file from web server but in > command line it is working fine. I don't know the possible reason, please > help me to figure it out. > > > -> I am using same example program with basic cgi modification for taking > input from web browser. > -> Ubuntu 10.04 64 bit OS > -> apache2 server > -> I have given all permissions 777 recursively to cgi-bin folder > > > -- > Regards, > Lucky Singh > > Institute of Bioinformatics and Applied Biotechnology, > ------------------------------------------------------ > Biotech Park > Electronics City Phase I > Bangalore 560 100 > India. 
> Tel: 080-28528900, 080-28528901, 080-28528902 > Fax: 080-28528904 > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 12:34:40 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:34:40 -0500 Subject: [Bioperl-l] fasta35 and fasta36 parsing support in BioPerl In-Reply-To: References: Message-ID: <4C95797A-343C-4651-AF0C-964A7E10E8D1@illinois.edu> Frederic, The best place to post this is to our bug server: http://redmine.open-bio.org Attach all relevant data for the bug, this really helps us to diagnose the issue. chris On Aug 25, 2011, at 8:24 AM, Fr?d?ric Sapet wrote: > Hello > I have tried to parse a fasta35 report file using BioPerl, in order to > produce a valid HTML file. > It seems to work well, but there's a small issue with homology string > in the report. > Please find in joined files, a test script. 
> > After that, I have tried to parse a fasta36 file, but this seems to be > not supported yet: here is the error thrown : > > Uncaught exception from user code: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Unrecognized alignment line (3) '>--' > STACK: Error::throw > STACK: Bio::Root::Root::throw > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm:472 > STACK: Bio::SearchIO::fasta::next_result > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm:1061 > STACK: ./test.pl:36 > ----------------------------------------------------------- > at /usr/lib/perl5/site_perl/5.10.0/Error.pm line 184 > Error::throw('Bio::Root::Exception', 'Unrecognized alignment line (3) > \'>--\'') called at > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm line > 472 > Bio::Root::Root::throw('Bio::SearchIO::fasta=HASH', 'Unrecognized > alignment line (3) \'>--\'') called at > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm > line 1061 > Bio::SearchIO::fasta::next_result('Bio::SearchIO::fasta=HASH') called > at ./test.pl line 36 > > Thank you > > Fred > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 12:42:30 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:42:30 -0500 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: Brian, I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. 
I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? chris On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: > bioperl-l, > > I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: > > /score=100.1 > > And adding a "note" tag, so the output file contains this: > > /score=100.1 > /note="score=100.1" > > I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. > > On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: > > /score=100.1 > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > > Should I comment out the code that's doing these edits or not? > > Thanks again, > > Brian O. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 12:58:51 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:58:51 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: Message-ID: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> On Aug 24, 2011, at 8:53 PM, J.J. Emerson wrote: > Hello All, > > I have experienced some behavior in SeqIO that doesn't seem to be what I > would expect. 
Basically, for a certain script, if I try to pass something > like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the > following two conditions are met simultaneously: > > 1. STDIN is coming from a pipe; > 2. SeqIO is trying to guess the format. > > If STDIN is coming from redirection instead of a pipe or if the format is > specified manually (i.e. BioPERL doesn't have to guess), the error doesn't > seem to occur. > > This issue has been reported previously: > > http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html > https://redmine.open-bio.org/issues/3122 Yes, this was addressed according to that case. > This issue is ultimately one of using seek() on a pipe, which is forbidden > (see below). To be clear, there are kludgy ways around this that allow > BioPERL to take input from a pipe AND guess the format. My naive and > inefficient kludge was to test for reading from STDIN and for the absence of > a format. If both of these conditions are met, then I slurp STDIN into a > variable and then open a filehandle on that variable, and pass it to SeqIO, > which can guess the format if the fh isn't opened on a pipe. SeqIO then > successfully guesses the format and does the SeqIO thing, at the expense of > having the program pass over the data at least twice. And if the input file > is huge, it could potentially consume all the memory. A better way to > address the problem would be to process the input one line at a time, but > this seems to require more extensive changes. Have you tried tempfiles? Not that this is a great solution, but it's very commonly used for large sequence data, and it is seekable. This behavior could also be wrapped in GuessSeqFormat, I suppose (but see below). > The reason I'm reposting this is because I think that the inability to guess > the sequence format from data originating from a pipe is an important > limitation for a fundamental part of BioPERL.
When designing scripts to be > used in pipelines, the inability to guess formats for piped data limits > BioPERL's pipelineability substantially. Even though previous reports of > this have been made and a bug opened and closed, I was wondering if anyone > thought this was worthwhile fixing so as to make SeqIO (and probably AlignIO > as well?) more flexible? > > Does anyone think this should be refiled as a bug? > > Cheers, > > J.J. The fundamental problem with pipes (as you indicated) is that the data stream is not seekable. We do have a built-in buffer in Bio::Root::IO that somewhat handles this, but Bio::Tools::GuessSeqFormat is (IIRC) designed to use the filehandle directly, bypassing the BioPerl IO layer completely. One solution is to redesign GuessSeqFormat to use Bio::Root::IO, have GuessSeqFormat push all data back to the buffer, then let SeqIO parse. That will require some fundamental changes for both Bio::Root::IO and Bio::SeqIO (note that one cannot pass a Bio::Root::IO instance to another Bio::Root::IO-based class for parsing at this time). The other option is (as hinted above) having GuessSeqFormat dump the data to a tempfile, seek back after guessing, and retain the filehandle for Bio::SeqIO. Not the best solutions, but either should work. My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway? If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. 
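For illustration, the tempfile workaround discussed above could look roughly like the following on the script side. This is only a minimal sketch, not BioPerl code: seekable_handle() is a hypothetical helper name, the 64 KB chunk size is arbitrary, and it uses only core Perl (File::Temp). It assumes that a seek() to the current position fails on a pipe, which is the same failure GuessSeqFormat runs into when it tries to rewind.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of BioPerl): return a filehandle that
# supports seek() and yields the same data as $in.
sub seekable_handle {
    my ($in) = @_;

    # Seeking to the current position is harmless on a regular file but
    # fails on a pipe, so it doubles as a cheap seekability probe.
    return $in if seek($in, 0, 1);

    # Not seekable: spool the stream to an anonymous temp file in chunks,
    # so the whole input never has to sit in memory at once.
    my $tmp = tempfile();    # scalar context: filehandle only
    binmode $tmp;
    while (1) {
        my $n = read($in, my $buf, 65536);
        die "Read failed: $!" unless defined $n;
        last if $n == 0;     # end of input
        print {$tmp} $buf or die "Write to temp file failed: $!";
    }

    # Rewind so the caller starts reading from the first byte.
    seek($tmp, 0, 0) or die "Cannot rewind temp file: $!";
    return $tmp;
}
```

A script would then call something like Bio::SeqIO->new(-fh => seekable_handle(\*STDIN)) and leave the format unspecified. Whether format guessing accepts the handle unchanged has not been verified here, so treat that call as an assumption. The trade-off is the one already mentioned: the data is written to disk once and read twice, in exchange for not holding it all in memory.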
> PS > > Below are snippets of code and/or errors related to reproducing the failure > to guess unspecified formats. I'll see how Mailman treats my attachments and > post the code as a reply if they don't work. > > The bioperl_fhtest.pl attachment is the script that reproduces the error. > The w.fa is a fasta file containing some sequence. > > Here are the command lines to generate the behavior I observe (w.fa is a > file containing some fasta sequences, in my case it was the w gene from > different *Drosophila* species): > > ./bioperl_fhtest.pl fasta < w.fa # Works (redirection, no guessing) >> ./bioperl_fhtest.pl < w.fa # Works (redirection, guessing) >> >> cat w.fa | ./bioperl_fhtest.pl fasta # Works (pipe, no guessing) >> cat w.fa | ./bioperl_fhtest.pl # DOESN'T work (pipe, guessing) >> > > > Here's the error I get in the last case: > > ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: Failed resetting the filehandle; IO error occurred >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472 >> STACK: Bio::Tools::GuessSeqFormat::guess >> /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512 >> STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381 >> STACK: ./bioperl_fhtest.pl:8 >> ----------------------------------------------------------- >> > >> From what I gather, the error is triggered by a failure of seek() on a STDIO > fh on lines 517-518 (text from the version GuessSeqFormat.pm installed on my > server): > > 512 if (defined $self->{-file}) { >> 513 # Close the file we opened. >> 514 close($fh); >> 515 } elsif (ref $fh eq 'GLOB') { >> 516 # Try seeking to the start position. >> 517 seek($fh, $start_pos, 0) || $self->throw("Failed resetting >> the ". >> 518 "filehandle; IO error >> occurred");; >> 519 } elsif (defined $fh && $fh->can('setpos')) { >> 520 # Seek to the start position. 
>> 521 $fh->setpos($start_pos); >> 522 } >> > _______________________________________________ You are always welcome to reopen and update the bug, or file a new one. chris From cjfields at illinois.edu Thu Aug 25 13:16:03 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 12:16:03 -0500 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: <393F144A-AECE-4F7D-B418-B71D46F3C82F@illinois.edu> Brian, Yes, that's correct (comment out or remove the other stuff). Not sure what difference it will make, I'm interested to see if anything fundamental expects this behavior and breaks with tests. Using 'git blame', it appears Allen Day added this in relation to Feature-Annotation code we actually reverted a few years ago, so this should be removed anyway. I still think we should work around FTHelper altogether. Reading the code, it seems like a ton of wasted instances being generated for no apparent reason. Now going back to our bioperl archives to see if there is any need for it... chris On Aug 25, 2011, at 11:53 AM, Brian Osborne wrote: > Chris, > > OK, will do. I should add that an early version of FTHelper was doing this same edit with the "strand", "source_tag", and "frame" tags but someone has commented out the "source_tag" and "strand" lines. > > Should I comment out both "score" and "frame" code? > > BIO > > On Aug 25, 2011, at 12:42 PM, Chris Fields wrote: > >> Brian, >> >> I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). >> >> To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. 
Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? >> >> chris >> >> On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: >> >>> bioperl-l, >>> >>> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >>> >>> /score=100.1 >>> >>> And adding a "note" tag, so the output file contains this: >>> >>> /score=100.1 >>> /note="score=100.1" >>> >>> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >>> >>> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >>> >>> /score=100.1 >>> /note="score=100.1" >>> /note="score=100.1" >>> /note="score=100.1" >>> /note="score=100.1" >>> >>> Should I comment out the code that's doing these edits or not? >>> >>> Thanks again, >>> >>> Brian O. >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bosborne11 at verizon.net Thu Aug 25 12:53:08 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 25 Aug 2011 12:53:08 -0400 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: Chris, OK, will do. 
I should add that an early version of FTHelper was doing this same edit with the "strand", "source_tag", and "frame" tags but someone has commented out the "source_tag" and "strand" lines. Should I comment out both "score" and "frame" code? BIO On Aug 25, 2011, at 12:42 PM, Chris Fields wrote: > Brian, > > I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). > > To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? > > chris > > On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: > >> bioperl-l, >> >> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >> >> /score=100.1 >> >> And adding a "note" tag, so the output file contains this: >> >> /score=100.1 >> /note="score=100.1" >> >> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >> >> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >> >> /score=100.1 >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> >> Should I comment out the code that's doing these edits or not? 
>> >> Thanks again, >> >> Brian O. >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jj.emerson at gmail.com Thu Aug 25 14:52:48 2011 From: jj.emerson at gmail.com (J.J. Emerson) Date: Thu, 25 Aug 2011 11:52:48 -0700 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: Hi Chris, You asked: My question (not a criticism, just trying to understand the problem): why > are you going through all the trouble of using GuessSeqFormat as a permanent > solution anyway? If you have a stream returning a possibly unknown data > type, I would argue that the fundamental bug is not GuessSeqFormat but > something else, more specifically not knowing the behavior of the data > source and the returned format to begin with. Is something preventing that? > In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a > permanent solution to your problems (it is guessing, after all). Note the > code has had very little development over the years, and the related SeqIO > code hasn't aged particularly well. > I see. 
I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? Cheers, J.J. PS * The way I plan on using my script is roughly as follows: prog1 [some arguments] \ | myscript.pl --informat fasta \ | prog2 \ | prog3 > pipeline.output However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. In principle, being able to do the following would be very useful: prog1 [some arguments] \ | myscript.pl \ | prog2 > pipeline.output The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. On Thu, Aug 25, 2011 at 9:58 AM, Chris Fields wrote: > On Aug 24, 2011, at 8:53 PM, J.J. Emerson wrote: > > > Hello All, > > > > I have experienced some behavior in SeqIO that doesn't seem to be what I > > would expect. Basically, for a certain script, if I try to pass something > > like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the > > following two conditions are met simultaneously: > > > > 1. STDIN is coming from a pipe; > > 2. SeqIO is trying to guess the format. > > > > If STDIO is coming from redirection instead of a pipe or if the format is > > specified manually (i.e. BioPERL doesn't have to guess), the error > doesn't > > seem to occur. > > > > This issue has been reported previously: > > > > http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html > > https://redmine.open-bio.org/issues/3122 > > Yes, this was addressed according to that case. 
> > > This issue is ultimately one of using seek() on a pipe, which is > forbidden > > (see below). To be clear, there are kludgy ways around this that allow > > BioPERL to take input from a pipe AND guess the format. My naive and > > inefficient kludge was to test for reading from STDIN and for the absence > of > > a format. If both of these conditions are met, then I slurp STDIN into a > > variable and then open a filehandle on that variable, and pass it to > SeqIO, > > which can guess the format if the fh isn't opened on a pipe. SeqIO then > > successfully guesses the format and does the SeqIO thing, at the expense > of > > having the program pass over the data at least twice. And if the input > file > > is huge, it could potentially consume all the memory. A better way to > > address the problem would be to process the input one line at a time, but > > this seems to require more extensive changes. > > Have you tried tempfiles? Not that this is a great solution, but it's very > commonly used for large sequence data, and it is seekable. This behavior > could also be wrapped in GuessSeqFormat i suppose (but see below) > > > The reason I'm reposting this is because I think that the inability to > guess > > the sequence format from data originating from a pipe is an important > > limitation for a fundamental part of BioPERL. When designing scripts to > be > > used in pipelines, the inability to guess formats for piped data limits > > BioPERL's pipelineability substantially. Even though previous reports of > > this have been made and a bug opened and closed, I was wondering if > anyone > > thought this was worthwhile fixing so as to make SeqIO (and probably > AlignIO > > as well?) more flexible? > > > > Does anyone think this should be refiled as a bug? > > > > Cheers, > > > > J.J. > > The fundamental problem with pipes (as you indicated) is that the data > stream is not seekable. 
We do have a built-in buffer in Bio::Root::IO that > somewhat handles this, but Bio::Tools::GuessSeqFormat is (IIRC) designed to > use the filehandle directly, bypassing the BioPerl IO layer completely. > > One solution is to redesign GuessSeqFormat to use Bio::Root::IO, have > GuessSeqFormat push all data back to the buffer, then let SeqIO parse. That > will require some fundamental changes for both Bio::Root::IO and Bio::SeqIO > (note that one cannot pass a Bio::Root::IO instance to another > Bio::Root::IO-based class for parsing at this time). > > The other option is (as hinted above) having GuessSeqFormat dump the data > to a tempfile, seek back after guessing, and retain the filehandle for > Bio::SeqIO. Not the best solutions, but either should work. > > My question (not a criticism, just trying to understand the problem): why > are you going through all the trouble of using GuessSeqFormat as a permanent > solution anyway? If you have a stream returning a possibly unknown data > type, I would argue that the fundamental bug is not GuessSeqFormat but > something else, more specifically not knowing the behavior of the data > source and the returned format to begin with. Is something preventing that? > > My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not > a permanent solution to your problems (it is guessing, after all). Note the > code has had very little development over the years, and the related SeqIO > code hasn't aged particularly well. > > > PS > > > > Below are snippets of code and/or errors related to reproducing the > failure > > to guess unspecified formats. I'll see how Mailman treats my attachments > and > > post the code as a reply if they don't work. > > > > The bioperl_fhtest.pl attachment is the script that reproduces the > error. > > The w.fa is a fasta file containing some sequence. 
> > > > Here are the command lines to generate the behavior I observe (w.fa is a > > file containing some fasta sequences, in my case it was the w gene from > > different *Drosophila* species): > > > > ./bioperl_fhtest.pl fasta < w.fa # Works (redirection, no guessing) > >> ./bioperl_fhtest.pl < w.fa # Works (redirection, guessing) > >> > >> cat w.fa | ./bioperl_fhtest.pl fasta # Works (pipe, no guessing) > >> cat w.fa | ./bioperl_fhtest.pl # DOESN'T work (pipe, guessing) > >> > > > > > > Here's the error I get in the last case: > > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > >> MSG: Failed resetting the filehandle; IO error occurred > >> STACK: Error::throw > >> STACK: Bio::Root::Root::throw > >> /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472 > >> STACK: Bio::Tools::GuessSeqFormat::guess > >> /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512 > >> STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381 > >> STACK: ./bioperl_fhtest.pl:8 > >> ----------------------------------------------------------- > >> > > > >> From what I gather, the error is triggered by a failure of seek() on a > STDIO > > fh on lines 517-518 (text from the version GuessSeqFormat.pm installed on > my > > server): > > > > 512 if (defined $self->{-file}) { > >> 513 # Close the file we opened. > >> 514 close($fh); > >> 515 } elsif (ref $fh eq 'GLOB') { > >> 516 # Try seeking to the start position. > >> 517 seek($fh, $start_pos, 0) || $self->throw("Failed > resetting > >> the ". > >> 518 "filehandle; IO error > >> occurred");; > >> 519 } elsif (defined $fh && $fh->can('setpos')) { > >> 520 # Seek to the start position. > >> 521 $fh->setpos($start_pos); > >> 522 } > >> > > _______________________________________________ > > You are always welcome to reopen and update the bug, or file a new one. 
> > chris > > From cjfields at illinois.edu Thu Aug 25 17:04:15 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 16:04:15 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: On Aug 25, 2011, at 1:52 PM, J.J. Emerson wrote: > Hi Chris, > > You asked: > > My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway? If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? > > In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. > > My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. > > I see. I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? That's fine. I don't want to dissuade you from taking this on, either. > Cheers, > > J.J. 
> > PS > > * The way I plan on using my script is roughly as follows: > > prog1 [some arguments] \ > | myscript.pl --informat fasta \ > | prog2 \ > | prog3 > pipeline.output > > However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. In principle, being able to do the following would be very useful: > > prog1 [some arguments] \ > | myscript.pl \ > | prog2 > pipeline.output > > The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. Not disagreeing with you at all, flexible code is best. chris From hlapp at drycafe.net Thu Aug 25 22:29:44 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 26 Aug 2011 11:29:44 +0900 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> Could this behavior perhaps be made optional, with the default being off? -hilmar On Aug 25, 2011, at 11:35 PM, Brian Osborne wrote: > bioperl-l, > > I need to run something by you before I commit code and tests. I > have code that takes a Genbank file as input and creates another > Genbank file as output. I noticed that SeqIO - specifically > FTHelper.pm - was taking a tag like this in the input file: > > /score=100.1 > > And adding a "note" tag, so the output file contains this: > > /score=100.1 > /note="score=100.1" > > I'm assuming that the code does this because NCBI will not accept > score tags and values even though Bioperl, generally speaking, does > not say that NCBI defines the fine details of Genbank format. > > On the other hand I don't like the idea that SeqIO is altering the > content. 
It also turns out that if you have code that does multiple > round-trips you end up with text like this: > > /score=100.1 > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > > Should I comment out the code that's doing these edits or not? > > Thanks again, > > Brian O. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From carandraug+dev at gmail.com Fri Aug 26 10:20:39 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Fri, 26 Aug 2011 15:20:39 +0100 Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast In-Reply-To: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> References: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Message-ID: On 22 August 2011 07:01, Lucky Singh wrote: > Now I > wanted to host it from web server, but This program is not working from it > may be it is not able to create or write on file from web server but in > command line it is working fine. I don't know the possible reason, please > help me to figure it out. Have you looked in the apache logs (look in /var/log/apache2/error.log) ? Can you pastebin your whole code and the content of the error log after trying to run the script? From bosborne11 at verizon.net Fri Aug 26 10:39:44 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 26 Aug 2011 10:39:44 -0400 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> References: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> Message-ID: <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> Hilmar, Yes, of course. 
Are you thinking that this code is designed, in part, to help people submit to NCBI? BIO On Aug 25, 2011, at 10:29 PM, Hilmar Lapp wrote: > Could this behavior perhaps be made optional, with the default being off? > > -hilmar > > On Aug 25, 2011, at 11:35 PM, Brian Osborne wrote: > >> bioperl-l, >> >> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >> >> /score=100.1 >> >> And adding a "note" tag, so the output file contains this: >> >> /score=100.1 >> /note="score=100.1" >> >> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >> >> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >> >> /score=100.1 >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> >> Should I comment out the code that's doing these edits or not? >> >> Thanks again, >> >> Brian O. 
>> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > From hlapp at drycafe.net Fri Aug 26 10:50:26 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 26 Aug 2011 23:50:26 +0900 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> References: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> Message-ID: On Aug 26, 2011, at 11:39 PM, Brian Osborne wrote: > Are you thinking that this code is designed, in part, to help people > submit to NCBI? I don't know, but perhaps. My thinking was, if the code is doing something that's useful in some, but bad in many or most other situations, it'd be nice if the useful behavior could be retained as an option for those who expressly want (or need) it. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From florent.angly at gmail.com Sat Aug 27 07:12:05 2011 From: florent.angly at gmail.com (Florent Angly) Date: Sat, 27 Aug 2011 21:12:05 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: <4E58D105.7050805@gmail.com> On the topic of guessing file formats, last I checked, it was difficult to reuse the format guessed by Bio::SeqIO For example, if I want to takes sequences in any format (FASTA, FASTQ, ...) 
and filter some of them out and put them in a new file in the same format, I need to do something along these lines:

# Open the file and let BioPerl guess its format
my $in = Bio::SeqIO->new( -file => $input_seqfile );

# Have BioPerl guess the format (again) so we can use the same format for the output file
my $format = $in->_guess_format( $input_seqfile );

# Open the output file (same format as the input file)
my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format );

# Now do the work...

The limitation of the code above is that it is more complex than it should be and forces BioPerl to check the file format twice. My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store or access its value. The idea would be that the example code above could be rewritten as:

# Open the file and let BioPerl guess its format
my $in = Bio::SeqIO->new( -file => $input_seqfile );

# Retrieve the format guessed by BioPerl
my $format = $in->format( );

# Open the output file using the same format as the input file
my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format );

# Now do the work...

I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, which guesses the alphabet but has a get/set method to access it.

Florent

On 26/08/11 07:04, Chris Fields wrote:
> On Aug 25, 2011, at 1:52 PM, J.J. Emerson wrote:
>
>> Hi Chris,
>>
>> You asked:
>>
>> My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway?
If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? >> >> In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. >> >> My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. >> >> I see. I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? > That's fine. I don't want to dissuade you from taking this on, either. > >> Cheers, >> >> J.J. >> >> PS >> >> * The way I plan on using my script is roughly as follows: >> >> prog1 [some arguments] \ >> | myscript.pl --informat fasta \ >> | prog2 \ >> | prog3> pipeline.output >> >> However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. 
In principle, being able to do the following would be very useful: >> >> prog1 [some arguments] \ >> | myscript.pl \ >> | prog2> pipeline.output >> >> The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. > Not disagreeing with you at all, flexible code is best. > > chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 26 23:54:05 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 26 Aug 2011 22:54:05 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E58D105.7050805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: On Aug 27, 2011, at 6:12 AM, Florent Angly wrote: > On the topic of guessing file formats, last I checked, it was difficult to reuse the format guessed by Bio::SeqIO > > For example, if I want to takes sequences in any format (FASTA, FASTQ, ...) and filter some of them out and put them in a new file in the same format, I need to do something along these lines: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Have Bioperl guess the format (again) so we can use the same format for the output file > my $format = $in->_guess_format( $input_seqfile ); > > # Open the output file (same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > The limitations of the code above is that in is more complex than it should be and forces Bioperl do check the file format twice. 
My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The name of the class is the format (that's how they are loaded). We could add this as a convenience level for Bio::SeqIO (fairly easy to do, actually), but it would only makes sense as a getter. Bio::SeqIO dynamically loads the proper Bio::SeqIO:: module in the constructor (Bio::SeqIO::genbank, for example). Being able to set the format to 'fasta' with a loaded Bio::SeqIO::genbank still gets GenBank format. > The idea would be that the example code above could be rewritten as: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Retrieve the format guessed by BioPerl > my $format = $in->format( ); > > # Open the output file using the same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. > > Florent Guessing the alphabet for the vast majority of sequence data isn't quite as complex and quixotic as guessing a sequence format. The latter is far more variable and infinitely increases, much like standards (ex: http://xkcd.com/927/). Not that sequences aren't capable of change... 
chris From hlapp at drycafe.net Fri Aug 26 23:43:57 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Sat, 27 Aug 2011 12:43:57 +0900 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E58D105.7050805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: The format is already available - it is in essence the class of the SeqIO instance: my $format = ref($in); Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: my $out = ref($in)->new(-file => ...); Would that address what you are trying to accomplish? -hilmar Sent with a tap. On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: > My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The idea would be that the example code above could be rewritten as: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Retrieve the format guessed by BioPerl > my $format = $in->format( ); > > # Open the output file using the same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. 
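
Hilmar's point that the format is, in essence, the class of the SeqIO instance can be illustrated with a minimal sketch. It assumes only that the reader's class name ends in the format name (as in Bio::SeqIO::genbank, mentioned earlier in the thread). The helper name format_from_class is hypothetical and is not a BioPerl method, and a stand-in object is used so the sketch runs without BioPerl installed:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): derive a format name from a
# reader object or a class name, assuming the last package component is the
# format, e.g. Bio::SeqIO::fasta -> fasta.
sub format_from_class {
    my ($obj_or_class) = @_;
    my $class = ref($obj_or_class) || $obj_or_class;
    my ($format) = $class =~ /::([^:]+)\z/;
    return $format;    # undef if the name contains no '::' separator
}

# Stand-in object blessed into a class name of the same shape as a real
# reader class; no BioPerl code is loaded here.
my $in = bless {}, 'Bio::SeqIO::fasta';

print format_from_class($in), "\n";                      # prints "fasta"
print format_from_class('Bio::SeqIO::genbank'), "\n";    # prints "genbank"
```

With a real reader, the returned string could be passed as the -format argument when opening the output stream. As noted elsewhere in the thread, this says nothing about format variants such as those of FASTQ.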
From florent.angly at gmail.com Sun Aug 28 05:08:32 2011 From: florent.angly at gmail.com (Florent Angly) Date: Sun, 28 Aug 2011 19:08:32 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: <4E5A0590.2010805@gmail.com> Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. Florent On 27/08/11 13:43, Hilmar Lapp wrote: > The format is already available - it is in essence the class of the SeqIO instance: > > my $format = ref($in); > > Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: > > my $out = ref($in)->new(-file => ...); > > Would that address what you are trying to accomplish? > > -hilmar > > Sent with a tap. > > On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: > >> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The idea would be that the example code above could be rewritten as: >> >> # Open the file and let BioPerl guess its format >> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >> >> # Retrieve the format guessed by BioPerl >> my $format = $in->format( ); >> >> # Open the output file using the same format as the input file >> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >> >> # Now do the work... >> >> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. 
From cjfields at illinois.edu Sat Aug 27 23:27:34 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sat, 27 Aug 2011 22:27:34 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E5A0590.2010805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> Message-ID: <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> There is no reason the variant couldn't also be a method; it's fairly generic to Bio::SeqIO. FASTQ just happens to be the only parser that takes advantage of it (probably b/c I added it when I refactored FASTQ :) See the code for Bio::SeqIO::new to see what is done. Again, like the format it only makes sense as a getter method. chris On Aug 28, 2011, at 4:08 AM, Florent Angly wrote: > > Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. > Florent > > > On 27/08/11 13:43, Hilmar Lapp wrote: >> The format is already available - it is in essence the class of the SeqIO instance: >> >> my $format = ref($in); >> >> Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: >> >> my $out = ref($in)->new(-file => ...); >> >> Would that address what you are trying to accomplish? >> >> -hilmar >> >> Sent with a tap. >> >> On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: >> >>> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. 
The idea would be that the example code above could be rewritten as: >>> >>> # Open the file and let BioPerl guess its format >>> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >>> >>> # Retrieve the format guessed by BioPerl >>> my $format = $in->format( ); >>> >>> # Open the output file using the same format as the input file >>> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >>> >>> # Now do the work... >>> >>> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florent.angly at gmail.com Sun Aug 28 18:35:36 2011 From: florent.angly at gmail.com (Florent Angly) Date: Mon, 29 Aug 2011 08:35:36 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> Message-ID: <4E5AC2B8.9060808@gmail.com> Hi, I implemented the format() getter method in Bio::SeqIO as discussed, essentially following the way proposed by Hilmar. The variant() method is not needed since Bio::SeqIO::fastq already has a get/set method for that. I noticed that there are plenty more Bio*IO modules that could benefit from having a format() method, e.g.: Bio::AlignIO Bio::ClusterIO Bio::FeatureIO Bio::MapIO Bio::OntologyIO Bio::SearchIO Bio::TreeIO Bio::Assembly::IO * The code could be copy-pasted for each of them but it is not very graceful. Is there a way we could have all these IO modules share the same format() method? 
* Note how the IO class for Bio::Assembly is called Bio::Assembly::IO, and not Bio::AssemblyIO like for other classes. This may be something to change in the future for consistency. Florent On 28/08/11 13:27, Chris Fields wrote: > There is no reason the variant couldn't also be a method; it's fairly generic to Bio::SeqIO. FASTQ just happens to be the only parser that takes advantage of it (probably b/c I added it when I refactored FASTQ :) > > See the code for Bio::SeqIO::new to see what is done. Again, like the format it only makes sense as a getter method. > > chris > > On Aug 28, 2011, at 4:08 AM, Florent Angly wrote: > >> Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. >> Florent >> >> >> On 27/08/11 13:43, Hilmar Lapp wrote: >>> The format is already available - it is in essence the class of the SeqIO instance: >>> >>> my $format = ref($in); >>> >>> Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: >>> >>> my $out = ref($in)->new(-file => ...); >>> >>> Would that address what you are trying to accomplish? >>> >>> -hilmar >>> >>> Sent with a tap. >>> >>> On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: >>> >>>> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. 
The idea would be that the example code above could be rewritten as: >>>> >>>> # Open the file and let BioPerl guess its format >>>> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >>>> >>>> # Retrieve the format guessed by BioPerl >>>> my $format = $in->format( ); >>>> >>>> # Open the output file using the same format as the input file >>>> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >>>> >>>> # Now do the work... >>>> >>>> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Sun Aug 28 21:10:27 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 28 Aug 2011 20:10:27 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E5AC2B8.9060808@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> <4E5AC2B8.9060808@gmail.com> Message-ID: On Aug 28, 2011, at 5:35 PM, Florent Angly wrote: > Hi, > > I implemented the format() getter method in Bio::SeqIO as discussed, essentially following the way proposed by Hilmar. The variant() method is not needed since Bio::SeqIO::fastq already has a get/set method for that. Right, but the method could be used by other modules if it were moved to Bio::SeqIO. for instance. 
> I noticed that there are plenty more Bio*IO modules that could benefit from having a format() method, e.g.: > Bio::AlignIO > Bio::ClusterIO > Bio::FeatureIO > Bio::MapIO > Bio::OntologyIO > Bio::SearchIO > Bio::TreeIO > Bio::Assembly::IO * > The code could be copy-pasted for each of them but it is not very graceful. Is there a way we could have all these IO modules share the same format() method? Move the method to Bio::Root::IO, the common base class for all of the above. > * Note how the IO class for Bio::Assembly is called Bio::Assembly::IO, and not Bio::AssemblyIO like for other classes. This may be something to change in the future for consistency. > > Florent That's possible; one could take advantage of that for redesign/API issues if it were needed. chris From noncoding at gmail.com Mon Aug 29 06:31:10 2011 From: noncoding at gmail.com (Remo Sanges) Date: Mon, 29 Aug 2011 12:31:10 +0200 Subject: [Bioperl-l] Opportunity: PhD in BIOINFORMATICS at SZN, Naples, Italy In-Reply-To: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net> References: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net> Message-ID: <4E5B6A6E.2020508@gmail.com> (Apologies if you have received this already or if this is considered spam. Please feel free to pass on to anyone who might be interested.) The Stazione Zoologica Anton Dohrn in Naples is among the top research institutions in the world in the fields of marine biology and ecology. The new established bioinformatics laboratory is seeking for a candidate interested in the evolution of genome architecture http://bit.ly/okEGvL We are looking for someone who understands basic biological and evolutionary problems and is able to independently accomplish bioinformatics tasks. Candidates will be expected to have knowledge of biology, genetics and functional genomics, to demonstrate the ability to work in a UNIX/Linux environment and to be familiar with a scripting language (e.g. Perl), a database system (e.g. 
MySQL) and a statistical programming environment (e.g R). Previous experience with comparative genomics and genomics databases as well as an understanding of statistical methods used in the interpretation of biological data is a desirable asset. Wet lab work might be required during the PhD. All the information about the PhD and the guidelines on how to apply are listed on the webpage http://bit.ly/d2WuXk The closing date for applications is 20 September 2011. Kind Regards Remo -- Remo Sanges Bioinformatics - Animal Physiology and Evolution Stazione Zoologica Anton Dohrn Villa Comunale, 80121 Napoli - Italy +39 081 5833428 From locarpau at upvnet.upv.es Mon Aug 29 12:47:13 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Mon, 29 Aug 2011 18:47:13 +0200 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> Message-ID: <1314636433.4e5bc291a40c6@webmail.upv.es> Hi all, I'm running codeml from the PAML package using the corresponding Bioperl wrapper. 
I'd like to save the output file as -outfile => 'mlc', as in:

my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml (
    -outfile        => 'mlc',
    -save_tempfiles => 1,
    -alignment      => $codon_MSA,
    -tree           => $biotree,
    -params         => {
        #'outfile' =>'mlc',
        'verbose'   => 1,
        'noisy'     => 9,
        'runmode'   => 0, #user tree
        'seqtype'   => 1,
        'model'     => $model,
        'NSsites'   => $NSsites,
        'fix_omega' => $fix_omega,
        'omega'     => $omega,
        'ncatG'     => $ncatG,
        'icode'     => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below (5:ciliate nuclear)
        #'fix_alpha' => 0,
        #'fix_kappa' => 0,
        #'RateAncestor' => 0,
        'CodonFreq' => 2,
        'cleandata' => 1, # remove sites with ambiguity data (1 yes, 0 no),
        'ndata'     => 1
    },
);

and subsequently parsing it using

my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./");

However, I get the following message.

------------- EXCEPTION -------------
MSG: Could not open mlc: No such file or directory
STACK Bio::Root::IO::_initialize_io /Library/Perl//5.10.0/Bio/Root/IO.pm:351
STACK Bio::Tools::Phylo::PAML::new /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239
STACK main::BranchSiteEvolAnalysis /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421
STACK toplevel /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939
-------------------------------------

which I guess means the output file is not being saved in the previous step. Does anyone know what's wrong? Thank you very much in advance for your help.
Cheers, Lorenzo From David.Messina at sbc.su.se Mon Aug 29 13:43:33 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 29 Aug 2011 19:43:33 +0200 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: <1314636433.4e5bc291a40c6@webmail.upv.es> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> Message-ID: Hi Lorenzo, and subsequently parsing it using > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./"); > > However, I get the following message. > > ------------- EXCEPTION ------------- > MSG: Could not open mlc: No such file or directory > > what I guess means the output file is not being saved in the previous step. > Your interpretation could be correct. I think though that it might be that the -dir parameter you specify, "./", is not correct. Are you seeing the mlc file in the '.' (current working) dir? If I remember correctly, by default the mlc file is created in a temporary directory in /scratch or /tmp, and the save_tempfiles flag simply keeps that temporary directory from being deleted. I don't have the docs in front of me, but I believe there's a way to get the path of the temp directory that B::T::P::PAML is using. If so, you can use that path as the value for the -dir parameter. Let me know if not, though, and we can follow up on this. Dave PS - also, could you verify that you're using the latest versions of bioperl-live and bioperl-run from Github? 
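
Dave's suggestion amounts to checking more than one candidate directory for the output file before handing a path to the parser. A minimal sketch in plain Perl follows, assuming the file is named mlc as in the thread. The helper find_output is hypothetical, and a throwaway directory stands in for whatever temporary directory the wrapper actually uses:

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: return the first existing path for $name among @dirs,
# or nothing if the file is in none of them.
sub find_output {
    my ($name, @dirs) = @_;
    for my $dir (@dirs) {
        next unless defined $dir && length $dir;
        my $path = File::Spec->catfile($dir, $name);
        return $path if -e $path;
    }
    return;
}

# Demonstration only: create a placeholder 'mlc' file in a throwaway
# directory that plays the role of the wrapper's temporary directory.
my $tmpdir = tempdir(CLEANUP => 1);
open my $fh, '>', File::Spec->catfile($tmpdir, 'mlc')
    or die "cannot write placeholder file: $!";
print {$fh} "placeholder\n";
close $fh;

# Look in the current directory first, then in the temporary directory.
my $path = find_output('mlc', File::Spec->curdir, $tmpdir);
print defined $path ? "found: $path\n" : "mlc not found\n";
```

If a path is found, it could be given to the parser's -file argument. Whether the wrapper really leaves mlc in its temporary directory is exactly what the thread is trying to establish, so the sketch only covers the lookup.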
From Kevin.M.Brown at asu.edu Mon Aug 29 14:09:29 2011 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 29 Aug 2011 11:09:29 -0700 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu><1314636433.4e5bc291a40c6@webmail.upv.es> Message-ID: <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu> Opening a file for output that does not exist requires the > or >> redirector (depending on if you want to overwrite or append output). my $parserF= Bio::Tools::Phylo::PAML->new (-file => ">mlc", -dir => "./"); Kevin Brown Center for Innovations in Medicine Biodesign Institute Arizona State University > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Dave Messina > Sent: Monday, August 29, 2011 10:44 AM > To: Lorenzo Carretero Paulet > Cc: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Saving Codeml Output file > > Hi Lorenzo, > > > and subsequently parsing it using > > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > "./"); > > > > However, I get the following message. > > > > ------------- EXCEPTION ------------- > > MSG: Could not open mlc: No such file or directory > > > > > > what I guess means the output file is not being saved in the previous > step. > > > > > Your interpretation could be correct. I think though that it might be > that > the -dir parameter you specify, "./", is not correct. Are you seeing > the mlc > file in the '.' (current working) dir? > > If I remember correctly, by default the mlc file is created in a > temporary > directory in /scratch or /tmp, and the save_tempfiles flag simply keeps > that > temporary directory from being deleted. 
> > I don't have the docs in front of me, but I believe there's a way to > get the > path of the temp directory that B::T::P::PAML is using. If so, you can > use > that path as the value for the -dir parameter. > > Let me know if not, though, and we can follow up on this. > > Dave > > PS - also, could you verify that you're using the latest versions of > bioperl-live and bioperl-run from Github? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Mon Aug 29 14:34:41 2011 From: scott at scottcain.net (Scott Cain) Date: Mon, 29 Aug 2011 14:34:41 -0400 Subject: [Bioperl-l] pls help.. In-Reply-To: References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Message-ID: Hi Ravi, Sorry I took a while to get back to you; I was on vacation last week. Also, please keep correspondence on the bioperl mailing list. If you had, perhaps somebody else would have provided another answer by now. I found the bug in the genbank2gff3 script that causes this problem. You have a few options for how to proceed: 1. Split the multi-genbank file into individual files, put them in a directory, and point the script at that directory (with the --dir flag). If you do this, you won't have to do anything with your BioPerl installation. 2. Get a fresh checkout of bioperl-live from git and install BioPerl from it, as I just committed the fix to the master branch. 3. Manually apply the fix that I just put into master. The diff is here: https://github.com/bioperl/bioperl-live/commit/1cff7d541e704a1f35d85bb27a0ab5911d89f8df Scott On Tue, Aug 23, 2011 at 12:55 AM, Ravi Devani wrote: > Yes the script works but have you seen the gff file generated by it. It has > multiple entries for the same features. And the file keeps on growing in > size with thw same features repeated many times. Thats the problem.. 
>
> Thanking you,
> Ravi
>
>
>
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                            scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)           216-392-3087
Ontario Institute for Cancer Research

From locarpau at upvnet.upv.es Mon Aug 29 14:56:50 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 20:56:50 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To:
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es>
Message-ID: <1314644210.4e5be0f277c05@webmail.upv.es>

Thanks Dave,

Yes. I do not find the output file in the current directory, or in the temp directory. Using

my $tmpdir = $codeml_factory->tempdir();
my $parserF= Bio::Tools::Phylo::PAML->new ( -file => "mlc", -dir => "$tmpdir" );

I still get the same error message. I'm using Bioperl version 1.006901.

Cheers,
Lorenzo

Quoting Dave Messina:

> Hi Lorenzo,
>
>
> and subsequently parsing it using
> > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./");
> >
> > However, I get the following message.
> >
> > ------------- EXCEPTION -------------
> > MSG: Could not open mlc: No such file or directory
> >
> >
> > what I guess means the output file is not being saved in the previous step.
> >
>
>
> Your interpretation could be correct. I think though that it might be that
> the -dir parameter you specify, "./", is not correct. Are you seeing the mlc
> file in the '.' (current working) dir?
>
> If I remember correctly, by default the mlc file is created in a temporary
> directory in /scratch or /tmp, and the save_tempfiles flag simply keeps that
> temporary directory from being deleted.
>
> I don't have the docs in front of me, but I believe there's a way to get the
> path of the temp directory that B::T::P::PAML is using. If so, you can use
> that path as the value for the -dir parameter.
>
> Let me know if not, though, and we can follow up on this.
>
> Dave
>
> PS - also, could you verify that you're using the latest versions of
> bioperl-live and bioperl-run from Github?
>

From locarpau at upvnet.upv.es Mon Aug 29 15:05:49 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 21:05:49 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu><1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu>
Message-ID: <1314644749.4e5be30d78cb7@webmail.upv.es>

Kevin,

Still the same. The previous message is now preceded by:

Filehandle GEN11 opened only for output at /Library/Perl//5.10.0/Bio/Root/IO.pm line 571

which points to

    # if the buffer been filled by _pushback then return the buffer
    # contents, rather than read from the filehandle
    if( @{$self->{'_readbuffer'} || [] } ) {
        $line = shift @{$self->{'_readbuffer'}};
    } else {
        $line = <$fh>;
    }

from the inner subroutine _readline of /Bio/Root/IO.pm

Best,
L

Quoting Kevin Brown:

> Opening a file for output that does not exist requires the > or >>
> redirector (depending on if you want to overwrite or append output).
> > my $parserF= Bio::Tools::Phylo::PAML->new (-file => ">mlc", -dir => > "./"); > > > > Kevin Brown > Center for Innovations in Medicine > Biodesign Institute > Arizona State University > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Dave Messina > > Sent: Monday, August 29, 2011 10:44 AM > > To: Lorenzo Carretero Paulet > > Cc: bioperl-l at lists.open-bio.org > > Subject: Re: [Bioperl-l] Saving Codeml Output file > > > > Hi Lorenzo, > > > > > > and subsequently parsing it using > > > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > > "./"); > > > > > > However, I get the following message. > > > > > > ------------- EXCEPTION ------------- > > > MSG: Could not open mlc: No such file or directory > > > > > > > > > > what I guess means the output file is not being saved in the > previous > > step. > > > > > > > > > Your interpretation could be correct. I think though that it might be > > that > > the -dir parameter you specify, "./", is not correct. Are you seeing > > the mlc > > file in the '.' (current working) dir? > > > > If I remember correctly, by default the mlc file is created in a > > temporary > > directory in /scratch or /tmp, and the save_tempfiles flag simply > keeps > > that > > temporary directory from being deleted. > > > > I don't have the docs in front of me, but I believe there's a way to > > get the > > path of the temp directory that B::T::P::PAML is using. If so, you can > > use > > that path as the value for the -dir parameter. > > > > Let me know if not, though, and we can follow up on this. > > > > Dave > > > > PS - also, could you verify that you're using the latest versions of > > bioperl-live and bioperl-run from Github? 
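
The "Filehandle GEN11 opened only for output" warning reported above is what Perl prints when code tries to read from a handle that was opened with '>'. The sketch below uses plain Perl only, with no BioPerl calls; the file name and its one-line contents are placeholders, not real codeml output. It suggests why a '>' prefix on a file handed to a parser is likely to empty the report instead of reading it.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Placeholder stand-in for a codeml "mlc" report in a scratch directory.
my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'mlc' );

open( my $out, '>', $path ) or die "Cannot write $path: $!";
print {$out} "lnL = -1234.5\n";    # made-up content, for illustration only
close($out);

# Read mode ('<'): the contents come back as expected.
open( my $in, '<', $path ) or die "Cannot read $path: $!";
my $first_line = <$in>;
close($in);

# Write mode ('>'): the file is truncated and the handle cannot be read.
my @warnings;
my $from_write_handle;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    open( my $fh, '>', $path ) or die "Cannot write $path: $!";
    $from_write_handle = <$fh>;    # undef, plus an "opened only for output" warning
    close($fh);
}

# After the '>' open, the report on disk is empty.
my $is_empty = -z $path ? 1 : 0;
```

Under this reading, a parser would want the plain path in read mode, and '>' belongs only on files the script itself intends to create.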

From Kevin.M.Brown at asu.edu Mon Aug 29 15:19:53 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 29 Aug 2011 12:19:53 -0700
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1314636433.4e5bc291a40c6@webmail.upv.es>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es>
Message-ID: <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>

OK, went back to the original message.

And here's where the problem actually originates...

my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml
    (
        # this should cause it to create a file called mlc
        -outfile        => '>mlc',
        -save_tempfiles => 1,
        -alignment      => $codon_MSA,
        -tree           => $biotree,
        -params         => {
            'verbose'   => 1,
            'noisy'     => 9,
            'runmode'   => 0,    # user tree
            'seqtype'   => 1,
            'model'     => $model,
            'NSsites'   => $NSsites,
            'fix_omega' => $fix_omega,
            'omega'     => $omega,
            'ncatG'     => $ncatG,
            'icode'     => 0,    # 0:universal code; 1:mammalian mt; 2-10:see below (5:ciliate nuclear)
            #'fix_alpha'    => 0,
            #'fix_kappa'    => 0,
            #'RateAncestor' => 0,
            'CodonFreq' => 2,
            'cleandata' => 1,    # remove sites with ambiguity data (1 yes, 0 no)
            'ndata'     => 1
        },
    );

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet
> Sent: Monday, August 29, 2011 9:47 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Saving Codeml Output file
>
> Hi all,
> I'm running codeml from the PAML package using the corresponding BioPerl
> wrapper. I'd like to save the output file as -outfile => 'mlc', as in:
>
> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml
>     (
>         -outfile        => 'mlc',
>         -save_tempfiles => 1,
>         -alignment      => $codon_MSA,
>         -tree           => $biotree,
>         -params         => {
>             #'outfile'  => 'mlc',
>             'verbose'   => 1,
>             'noisy'     => 9,
>             'runmode'   => 0,    # user tree
>             'seqtype'   => 1,
>             'model'     => $model,
>             'NSsites'   => $NSsites,
>             'fix_omega' => $fix_omega,
>             'omega'     => $omega,
>             'ncatG'     => $ncatG,
>             'icode'     => 0,    # 0:universal code; 1:mammalian mt; 2-10:see below (5:ciliate nuclear)
>             #'fix_alpha'    => 0,
>             #'fix_kappa'    => 0,
>             #'RateAncestor' => 0,
>             'CodonFreq' => 2,
>             'cleandata' => 1,    # remove sites with ambiguity data (1 yes, 0 no)
>             'ndata'     => 1
>         },
>     );
>
> and subsequently parsing it using
> my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./");
>
> However, I get the following message.
>
> ------------- EXCEPTION -------------
> MSG: Could not open mlc: No such file or directory
> STACK Bio::Root::IO::_initialize_io
> /Library/Perl//5.10.0/Bio/Root/IO.pm:351
> STACK Bio::Tools::Phylo::PAML::new
> /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239
> STACK main::BranchSiteEvolAnalysis
> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421
> STACK toplevel
> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939
> -------------------------------------
>
> which I guess means the output file is not being saved in the previous
> step.
> Does anyone know what's wrong?
> Thank you very much in advance for your help.
> Cheers,
> Lorenzo

From locarpau at upvnet.upv.es Mon Aug 29 19:19:46 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Tue, 30 Aug 2011 01:19:46 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>
Message-ID: <1314659986.4e5c1e9268078@webmail.upv.es>

Kevin,

That's pretty reasonable, but unfortunately it still doesn't run, even if I create the file as $outfile and pass it to the wrapper as -outfile => $outfile. It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at creating the outfile. Did anyone manage to generate the outfile from Bio::Tools::Run::Phylo::PAML::Codeml?

Cheers,
Lorenzo

Quoting Kevin Brown:

> OK, went back to the original message.
>
> And here's where the problem actually originates...
> [...]

From jason.stajich at gmail.com Mon Aug 29 20:05:57 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Mon, 29 Aug 2011 17:05:57 -0700
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1314659986.4e5c1e9268078@webmail.upv.es>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> <1314659986.4e5c1e9268078@webmail.upv.es>
Message-ID:

I think you are mistaken on how to use the factory running objects and associated parser.

You don't have to instantiate a parser, as this is what is returned by the run command. The whole point is you don't need to get to the tempdir or specify opening of the mlc file or all the other output files from the program. You get to use the parser to get the data out, and then it cleans up afterwards, so you can run many iterations of runs in separate folders without having to clean up afterwards.

http://www.bioperl.org/wiki/HOWTO:PAML

my $factory = Bio::Tools::Run::Phylo::PAML::Codeml->new( ... );
my ($rc,$parser) = $factory->run( );

if( my $result = $parser->next_result ) {
    # $result is a Bio::Tools::Phylo::PAML::Result object
}

On Aug 29, 2011, at 4:19 PM, Lorenzo Carretero Paulet wrote:

> Kevin,
> That's pretty reasonable, but unfortunately still doesn't run. Even if I create
> the file as $outfile and give it as value to the wrapper as -outfile
> =>$outfile. It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at
> creating the outfile. Did anyone manage to generate the outfile from
> Bio::Tools::Run::Phylo::PAML::Codeml.
> Cheers,
> Lorenzo
> [...]

From fs5 at sanger.ac.uk Tue Aug 30 05:45:46 2011
From: fs5 at sanger.ac.uk (Frank Schwach)
Date: Tue, 30 Aug 2011 10:45:46 +0100
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu>
References: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu>
Message-ID: <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk>

Yes, I still have the primer3redux doc on my TODO list. Sorry, I haven't
had the time to do this lately but will look into this as soon as I
can.
Frank

On Mon, 2011-08-22 at 15:10 -0500, Chris Fields wrote:
> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote:
>
> > my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile =>
> > "temp.out", -path => "/usr/bin/primer3_core");
> >
> > If I use this:
> > $primer3->add_targets(
> > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
> > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
> > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
> > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
> > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
> > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
> > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
> > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
> > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE'
> > =>$PRIMER_PRODUCT_SIZE_RANGE);
> >
> > I get:
> > Can't locate object method "add_targets" via package
> > "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1.
> >
> > On the other hand, if I change that line to:
> > $primer3->set_parameters(
> > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
> > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
> > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
> > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
> > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
> > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
> > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
> > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
> > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE'
> > =>$PRIMER_PRODUCT_SIZE_RANGE);
> >
> > It works. When I looked at the source code for Primer3Redux, I
> > couldn't find add_targets, but set_parameters looked like it might
> > work, so I used that instead, and it worked.
> >
> > But I see over in the github that there are other issues with the
> > documentation (how primer3redux's result object is now 3 deep rather
> > than 2 deep). Not sure if this is in that category or not.
>
> That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github; if you can file this as well it would help.
>
> I have to go back and look at the history of add_targets() relative to primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience...
>
> chris
>
> > Thanks,
> > Anand
> ...

--
 The Wellcome Trust Sanger Institute is operated by Genome Research
 Limited, a charity registered in England with number 1021457 and a
 company registered in England with number 2742969, whose registered
 office is 215 Euston Road, London, NW1 2BE.

From manju.rawat2 at gmail.com Tue Aug 30 07:22:33 2011
From: manju.rawat2 at gmail.com (Manju Rawat)
Date: Tue, 30 Aug 2011 07:22:33 -0400
Subject: [Bioperl-l] Bioperl query....
Message-ID:

Hey, please help me.
I am very new to BioPerl.
I want to use a BLAST report in my program,
but I don't know how to use it. Please tell me how to use the HSP, gaps, etc.
methods,
and how to use them to extract values from a BLAST file.

Thanks
Manju Rawat

From roy.chaudhuri at gmail.com Tue Aug 30 07:25:32 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 30 Aug 2011 12:25:32 +0100
Subject: [Bioperl-l] Bioperl query....
In-Reply-To:
References:
Message-ID: <4E5CC8AC.8050800@gmail.com>

Hi Manju,

See:
http://www.bioperl.org/wiki/HOWTO:SearchIO

Cheers,
Roy.

On 30/08/2011 12:22, Manju Rawat wrote:
> Hey, please help me.
> I am very new to BioPerl.
> I want to use a BLAST report in my program,
> but I don't know how to use it. Please tell me how to use the HSP, gaps, etc.
> methods,
> and how to use them to extract values from a BLAST file.
>
> Thanks
> Manju Rawat

From cjfields at illinois.edu Tue Aug 30 09:54:19 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 30 Aug 2011 08:54:19 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To: <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk>
References: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk>
Message-ID: <8063FB1D-4557-4D1B-B9EF-9833ECD440E9@illinois.edu>

S'okay, we're all a bit busy :P

chris

On Aug 30, 2011, at 4:45 AM, Frank Schwach wrote:

> Yes, I still have the primer3redux doc on my TODO list. Sorry, haven't
> had the time to do this lately but will look into this as soon as I
> can.
> Frank
>
> On Mon, 2011-08-22 at 15:10 -0500, Chris Fields wrote:
>> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote:
>>
>>> my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile =>
>>> "temp.out", -path => "/usr/bin/primer3_core");
>>>
>>> If I use this:
>>> $primer3->add_targets(
>>> 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
>>> 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
>>> 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
>>> 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
>>> 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
>>> 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
>>> 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
>>> 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
>>> 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE'
>>> =>$PRIMER_PRODUCT_SIZE_RANGE);
>>>
>>> I get:
>>> Can't locate object method "add_targets" via package
>>> "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1.
>>>
>>> On the other hand, if I change that line to:
>>> $primer3->set_parameters(
>>> 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
>>> 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
>>> 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
>>> 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
>>> 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
>>> 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
>>> 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
>>> 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
>>> 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE'
>>> =>$PRIMER_PRODUCT_SIZE_RANGE);
>>>
>>> It works. When I looked at the source code for Primer3Redux, I
>>> couldn't find add_targets, but set_parameters looked like it might
>>> work, so I used that instead, and it worked.
>>>
>>> But I see over in the github that there are other issues with the
>>> documentation (how primer3redux's result object is now 3 deep rather
>>> than 2 deep). Not sure if this is in that category or not.
>>
>> That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github; if you can file this as well it would help.
>>
>> I have to go back and look at the history of add_targets() relative to primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience...
>>
>> chris
>>
>>> Thanks,
>>> Anand
>> ...
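
The remark quoted above, that add_targets() would be "easy enough to alias", can be pictured with a small stand-alone sketch. This is not the Primer3Redux source: the package My::Primer3Stub and the body of its set_parameters are hypothetical stand-ins, and only the two method names come from the thread. The parameter values are arbitrary examples.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; NOT the real Bio::Tools::Run::Primer3Redux.
package My::Primer3Stub;

sub new { return bless { params => {} }, shift }

# Stand-in for the method that works in the thread: store key/value pairs
# and report how many parameters are now held.
sub set_parameters {
    my ( $self, %args ) = @_;
    @{ $self->{params} }{ keys %args } = values %args;
    return scalar keys %{ $self->{params} };
}

# The alias: bind the old name to the same code reference, so both names
# run exactly the same sub.
{
    no warnings 'once';
    *add_targets = \&set_parameters;
}

package main;

my $p3    = My::Primer3Stub->new;
my $count = $p3->add_targets(
    'PRIMER_OPT_TM'   => 60,        # example values, not recommendations
    'SEQUENCE_TARGET' => '50,2',
);
print "stored $count parameters via the alias\n";
```

A glob assignment like this keeps a single implementation, so any later fix to set_parameters would apply to both names.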

From locarpau at upvnet.upv.es Tue Aug 30 10:58:51 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Tue, 30 Aug 2011 16:58:51 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To:
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> <1314659986.4e5c1e9268078@webmail.upv.es>
Message-ID: <1314716331.4e5cfaab4958e@webmail.upv.es>

Thanks Jason,

Ok, I see. That's what I was trying at the beginning. This runs OK in my scripts for branch-specific models. However, when I try branch-site models (NSsites > 0) and try to parse the results using

my $model_result= $paml_result->get_NSSite_results

I start to have problems. According to Dumper, I'm able to generate a Bio::Tools::Phylo::PAML::Result object $paml_result, but it doesn't store any Bio::Tools::Phylo::PAML::ModelResult that could be accessed using get_NSSite_results. See below a little piece of code to illustrate what I'm saying.

my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml
    (
        -alignment => $codon_MSA,
        -tree      => $biotree,
        -params    => {
            ...
            ...parameter values
            ...
        },
    );
my ($rc,$parser) = $codeml_factory->run(); # or run($dna_aln,$biotree)
#$codeml_factory->cleanup();
my $paml_result = $parser->next_result;
say Dumper $paml_result; #This returns a true Bio::Tools::Phylo::PAML::Result object!!!
my $model_result= $paml_result->get_NSSite_results;
say Dumper $model_result; #This doesn't return a true Bio::Tools::Phylo::PAML::ModelResult object ($VAR1 = 0;)!!!
$ns_string = "model ".$model_result->model_num."\n".$model_result->model_description()."\n".$model_result->time_used."\n";

As no ModelResult object is generated, the script stops, returning:

Can't call method "model_num" without a package or object reference

That's why I was trying to save the mlc output file and parse it, instead of directly parsing the Bio::Tools::Phylo::PAML object.

Best,
Lorenzo

PS: I'm using PAML version 4.4b (July 2010) and BioPerl 1.006901 on Mac OS X.

Quoting Jason Stajich:

> I think you are mistaken on how to use the factory running objects and
> associated parser.
>
> You don't have to instantiate a parser as this is what is returned by the run
> command. The whole point is you don't need to get to the tempdir or specify
> opening of the mlc file or all the other output files from the program. you
> get to use the parser to get the data out and then it cleans up afterwards so
> you can run many iterations of runs in separate folders without having to
> cleanup afterwards.
>
> http://www.bioperl.org/wiki/HOWTO:PAML
>
> my $factory = Bio::Tools::Run::Phylo::PAML::Codeml->new( ... );
> my ($rc,$parser) = $factory->run( );
>
> if( my $result = $parser->next_result ) {
>     # $result is a Bio::Tools::Phylo::PAML::Result object
> }
>
> On Aug 29, 2011, at 4:19 PM, Lorenzo Carretero Paulet wrote:
>
> > Kevin,
> > That's pretty reasonable, but unfortunately still doesn't run. Even if I create
> > the file as $outfile and give it as value to the wrapper as -outfile
> > =>$outfile. It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at
> > creating the outfile. Did anyone manage to generate the outfile from
> > Bio::Tools::Run::Phylo::PAML::Codeml.
> > Cheers,
> > Lorenzo
> > [...]
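
One possible reading of the `$VAR1 = 0` that Dumper prints above, offered as an assumption and not as a diagnosis: if get_NSSite_results returns a list of ModelResult objects, then assigning its return value to a scalar yields the number of elements, not an object, and 0 would mean the parser collected no NSsites results at all. The sketch below uses a hypothetical stand-in sub (get_results), not BioPerl, to show how scalar and list context differ for a sub that returns an array.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method that returns a list of result objects.
sub get_results {
    my ($stored) = @_;
    return @{ $stored || [] };
}

my $none = [];
my $two  = [ { model_num => 1 }, { model_num => 2 } ];

# Scalar context: the returned array is evaluated as a count.
my $scalar_empty = get_results($none);
my $scalar_two   = get_results($two);

# List context: parentheses on the left capture the first element.
my ($first) = get_results($two);

# Iterating in list context visits every element and is safe when empty.
my @model_nums = map { $_->{model_num} } get_results($two);

print "scalar (empty): $scalar_empty\n";
print "scalar (two):   $scalar_two\n";
print "first model:    $first->{model_num}\n";
print "all models:     @model_nums\n";
```

If that reading holds, looping over the returned list would avoid calling model_num on a plain number, although an empty list would still leave open why no NSsites results were parsed.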
> >>> Cheers, > >>> Lorenzo > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From shalabh.sharma7 at gmail.com Tue Aug 30 11:26:00 2011 From: shalabh.sharma7 at gmail.com (shalabh sharma) Date: Tue, 30 Aug 2011 11:26:00 -0400 Subject: [Bioperl-l] Bioperl query.... In-Reply-To: <4E5CC8AC.8050800@gmail.com> References: <4E5CC8AC.8050800@gmail.com> Message-ID: Hi Manju, Just follow the link sent by Roy. It also contain some useful example scripts. What i am suggesting is , you should run a blast on a very small data set that you can inspect easily and manually. Then parse it using SeachIO (follow the link) and you will get a fair idea that how it works. -Shalabh On Tue, Aug 30, 2011 at 7:25 AM, Roy Chaudhuri wrote: > Hi Manju, > > See: > http://www.bioperl.org/wiki/**HOWTO:SearchIO > > Cheers, > Roy. > > > On 30/08/2011 12:22, Manju Rawat wrote: > >> Hey Pls help me.. >> I am very new in Bioperl.. >> And i want to use blast report in my programming.. >> But i dnt know how to use it...pls tell me how to use HSP,gaps.etc >> methods??/ >> how to use them to extract valus from blast file.. 
>> >> Thanks >> Manju Rawat -- Shalabh Sharma Scientific Computing Professional Associate (Bioinformatics Specialist) Department of Marine Sciences University of Georgia Athens, GA 30602-3636 From longbow0 at gmail.com Wed Aug 31 11:48:16 2011 From: longbow0 at gmail.com (longbow leo) Date: Wed, 31 Aug 2011 10:48:16 -0500 Subject: [Bioperl-l] How to color leaves of a tree by Bio::Tree::Draw::Cladogram? Message-ID: Dear all, I am using the module Bio::Tree::Draw::Cladogram to create a tree diagram. But when I tried to color the tree leaves, the diagram was still without any colors. How can I color the tree leaves? Thanks in advance. Here is my script:
######################################################################
#!/usr/bin/perl

use strict;
use warnings;

use Bio::TreeIO;
use Bio::Tree::Draw::Cladogram;

my $treei = Bio::TreeIO->new(
    -fh     => \*DATA,
    -format => 'newick',
);

my $tree = $treei->next_tree;

# Color node 'B' to red
my ($nodeB) = $tree->find_node( -id => 'B' );

$nodeB->add_tag_value('Rcolor', 1);
$nodeB->add_tag_value('Gcolor', 0);
$nodeB->add_tag_value('Bcolor', 0);

my $cg = Bio::Tree::Draw::Cladogram->new(
    -tree => $tree,
);

$cg->print( -file => 'mytree.eps' );

__DATA__
(((A:5,B:5)90:2,C:4)25:3,D:10);
######################################################################
Regards, Haizhou From roy.chaudhuri at gmail.com Wed Aug 31 12:02:30 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 31 Aug 2011 17:02:30 +0100 Subject: [Bioperl-l] How to color leaves of a tree by Bio::Tree::Draw::Cladogram?
In-Reply-To: References: Message-ID: <4E5E5B16.9070704@gmail.com> Hi Haizhou, I think you need to specify -colors=>1 in your Bio::Tree::Draw::Cladogram constructor: my $cg = Bio::Tree::Draw::Cladogram->new( -tree => $tree, -colors => 1 ); Not sure why this isn't on by default. Roy. On 31/08/2011 16:48, longbow leo wrote: > Dear all, > > I am using the module Bio::Tree::Draw::Cladogram to create a tree diagram. > But when I tried to color the tree leaves, the diagram was still without any > colors. > > How can I color tree leave? Thanks in advance. > > Here is my script: > > ###################################################################### > > > #!/usr/bin/perl > > use strict; > use warnings; > > use Bio::TreeIO; > use Bio::Tree::Draw::Cladogram; > > my $treei = Bio::TreeIO->new( > -fh => \*DATA, > -format => 'newick', > ); > > my $tree = $treei->next_tree; > > # Color node 'B' to red > my ($nodeB) = $tree->find_node( -id => 'B' ); > > $nodeB->add_tag_value('Rcolor', 1); > $nodeB->add_tag_value('Gcolor', 0); > $nodeB->add_tag_value('Bcolor', 0); > > my $cg = Bio::Tree::Draw::Cladogram->new( > -tree => $tree, > ); > > $cg->print( -file => 'mytree.eps' ); > > __DATA__ > (((A:5,B:5)90:2,C:4)25:3,D:10); > > > ###################################################################### > > Regards, > > Haizhou > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Mon Aug 1 00:07:38 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 31 Jul 2011 23:07:38 -0500 Subject: [Bioperl-l] BioPerl Test requirements Message-ID: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> All, We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. 
One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with Debian packaging. As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). Anyway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, i.e. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Differences (Bio::Root::Test has a few additional methods of use as well). As this will require developers to use Test::Most instead, though, I thought it would be worth asking on the list to see if there are any objections. Any thoughts? chris From cjfields at illinois.edu Mon Aug 1 00:42:39 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 31 Jul 2011 23:42:39 -0500 Subject: [Bioperl-l] protaparam In-Reply-To: References: Message-ID: <44853A9D-9E78-469E-B8D8-B06EBDB5F780@illinois.edu> Shachi, My guess is this is not a BioPerl-specific issue, but that the web service interface has changed or is no longer active. Unfortunately this is one module that has no tests associated with it, so this slipped through the cracks. You are more than welcome to file a bug on this, but if the service is inactive we'll likely immediately deprecate the module.
chris On Jul 28, 2011, at 11:46 PM, Shachi Gahoi wrote: > Dear All, > > If anybody know how to rum protparam using bioperl please let me know. > > > Thanks in advance > > -- > Regards, > Shachi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason.stajich at gmail.com Mon Aug 1 03:12:32 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Sun, 31 Jul 2011 23:12:32 -0800 Subject: [Bioperl-l] Fwd: Bio::Tools::Run::Phylo::Phyml, tree_string References: Message-ID: <3521B67E-D158-492A-8A60-025D6C5C9934@gmail.com> Heikki - can you take a look at this when you get time - I'm unclear what the BIONJ string is used for? Begin forwarded message: > From: Tristan Lefebure > Date: July 27, 2011 6:12:16 AM AKDT > To: bioperl mailing list > Subject: Re: [Bioperl-l] Bio::Tools::Run::Phylo::Phyml, tree_string > > done: > https://redmine.open-bio.org/issues/3273 > > -- > Tristan > > On Tue, Jul 26, 2011 at 9:43 PM, Chris Fields wrote: >> That's an odd one. Could you file this on redmine? >> >> chris >> >> On Jul 26, 2011, at 10:14 AM, Tristan Lefebure wrote: >> >>> Ouups, I found a typo in my post, it should read: >>> >>> I am not quite sure I understand why tree_string() from >>> Bio::Tools::Run::Phylo::Phyml returns >>> a string that looks like that (I removed the end of the tree): >>> >>> BIONJ(((((((('92':0.0114354726,'12':0.0472591023)0.0000000000:0.0000005859,... >>> >>> On Tue, Jul 26, 2011 at 4:47 PM, Tristan Lefebure >>> wrote: >>>> Hi there, >>>> I am not quite sure I understand why tree_string() from Bio::Tools::Run::Phylo::Phyml returns >>>> a string that looks like that (I removed the end of the tree): >>>> >>>> Tree is BIONJ(((((((('92':0.0114354726,'12':0.0472591023)0.0000000000:0.0000005859,... >>>> >>>> Why do we have this 'Tree is BIONJ' thing? 
>>>> >>>> A quick look at the code in the _run() function gives : >>>> >>>> { >>>> open(my $FH_TREE, "<", $tree_file) >>>> || $self->throw("Phyml call ($command) did not give an output: $?"); >>>> local $/; >>>> $self->{_tree} .= <$FH_TREE>; >>>> } >>>> >>>> Why appending something to $self->{_tree}? What about? >>>> $self->{_tree} = <$FH_TREE>; >>>> >>>> I was about to fill a bug report, but then I saw that in Phyml.t: >>>> >>>> is substr($factory->tree_string, 0, 9), 'BIONJ(SIN', 'tree_string()'; >>>> >>>> Well, I am lost. Any help much appreciated... >>>> >>>> -- >>>> Tristan >>>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Mon Aug 1 05:09:47 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 1 Aug 2011 11:09:47 +0200 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: Sounds good, Chris. Go for it. Dave From hlapp at drycafe.net Mon Aug 1 16:30:18 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Mon, 1 Aug 2011 16:30:18 -0400 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. -hilmar On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: > All, > > We are currently using a BioPerl-specific module for running tests > called Bio::Root::Test. 
It is essentially a wrapper module, re- > exporting all the methods for Test::More, Test::Exception, and > Test::Warn. One problem: it currently expects a copy of Test::Warn > and Test::Exception in each repository as a fallback. Another > problem: these included modules appear to be triggering dependencies > with debian packaging. > > As an example of one hidden dependency, the included Test::Warn > requires Array::Compare, which converted to Moose a few years ago, > so you automatically have to install the entire Moose dependency > tree, even though Bioperl doesn't require it (not a slam on Moose, > you really SHOULD be using Moose these days. No, really :). > > Anway, more recent versions of Test::Warn don't have this > requirement, but as we package an old version of this module we get > stuck with the dependencies until we (manually) update this for each > repository. Ick. > > I think the best solution is to remove the bioperl-local modules in > t/lib and list Test::Most instead as a 'build_requires' in Build.PL, > e.g. the module is only necessary for the build phase so is > optionally installed. Test::Most essentially does exactly the same > thing as Bio::Root::Test and more; it also includes Test::Deep and > Test::Diff (Bio::Root::Test has a few additional methods of use as > well). > > As this will require developers to use Test::Most instead, though, I > though it would be worth asking on the list to see if there are any > objections. Any thoughts? 
> > > chris > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Mon Aug 1 16:34:56 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 1 Aug 2011 15:34:56 -0500 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! chris On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. > > -hilmar > > On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: > >> All, >> >> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >> >> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). 
>> >> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >> >> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >> >> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >> >> >> chris >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at drycafe.net Mon Aug 1 18:36:27 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Mon, 1 Aug 2011 18:36:27 -0400 Subject: [Bioperl-l] Job opportunity: User Interface Design and Web Application Developer Message-ID: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net> (Apologies if you have received this already or if this is considered spam - we're trying to reach out as broad as possible and I know that quite a few in the Bio* communities would be well qualified. Please feel free to pass on to anyone who might be interested, or might know someone who is.) 
User Interface Design and Web Application Developer The National Evolutionary Synthesis Center (NESCent) seeks a creative and enthusiastic individual to design user interfaces and web applications for scientific applications that manage, analyze, visualize and share data in support of evolutionary research. The incumbent will work as part of a small informatics team in close collaboration with domain scientists. NESCent (http://nescent.org) is an NSF-funded center dedicated to cross-disciplinary research in evolutionary science. Our informatics team works closely with visiting and resident scientists to support their custom software and database development needs (http://informatics.nescent.org ), and collaborates broadly with other biodiversity informatics projects. All NESCent software products are open-source, and the Center has a number of initiatives to actively promote collaborative development of community software resources. Above all, we are enthusiastic about our work, about the mission of the Center, and about the contribution of informatics to that mission. Job description: The incumbent will design and develop user interfaces and web applications for databases and other software tools for sponsored scientists and staff. The job responsibilities include all stages of the software development process, including requirements gathering, design, implementation, release packaging and documentation, as part of a small team (typically 2-3 individuals). We expect the incumbent to present their work at conferences and contribute to publications with scientific collaborators; interact regularly with visiting and resident scientists, other members of the informatics team and Center staff; and generally serve as an expert resource for Center personnel. The position provides opportunities for professional development and encourages research into new technologies. 
Most informatics staff work at our Durham NC offices, located adjacent to Duke University, but we support a wide range of technologies for virtual communication with off-site staff and collaborators.

Salary range: $70,000 - $80,000, depending on education and experience

Required Qualifications:
* Demonstrated success collaborating with clients on custom software solutions
* Experience with various stages of the software development cycle
* Expertise in development and testing of user interface designs
* Excellent communication skills, both virtual and face-to-face

Preferred Qualifications:
* M.S. or Ph.D. in Computer Science, Bioinformatics or related field
* Demonstrated interest in science, particularly biology
* Expertise in dynamic and interactive web technologies (JavaScript, CGI)
* Expertise in rapid application development and respective programming technologies and languages (e.g., modern scripting languages and web-application frameworks such as Python/Django, Ruby/Ruby-on-Rails, and Perl/Catalyst).
* Expertise in graphic design
* Expertise in data visualization and/or scientific data integration
* Expertise in software usability design and assessment
* Expertise in web service (SOAP, REST, XML, JSON) and semantic web technologies
* Fluency in Java programming
* Prior experience in relational database programming (PostgreSQL or MySQL)
* Experience with open-source, and collaborative, software development

How to apply: Please send cover letter, resume and contact information for three references to Dr. Karen Cranston, Training Coordinator and Bioinformatics Project Manager (karen.cranston at nescent.org). Please also complete the online application at the University of North Carolina HR website: http://bit.ly/r9HQ8r. Informal inquiries or requests for additional information may be directed to Dr. Cranston by email or phone (+1-919-613-2275). Closing date is August 15, 2011.
-- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From florent.angly at gmail.com Mon Aug 1 20:09:51 2011 From: florent.angly at gmail.com (Florent Angly) Date: Tue, 02 Aug 2011 10:09:51 +1000 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Message-ID: <4E37404F.1040001@gmail.com> If Test::Most gives more testing capabilities and makes packaging Bioperl easier, I think it's pretty sweet! Florent On 02/08/11 06:34, Chris Fields wrote: > Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! > > chris > > On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > >> I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. >> >> -hilmar >> >> On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: >> >>> All, >>> >>> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >>> >>> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). 
>>> >>> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >>> >>> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >>> >>> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >>> >>> >>> chris >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : >> =========================================================== >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hartzell at alerce.com Mon Aug 1 20:06:54 2011 From: hartzell at alerce.com (George Hartzell) Date: Mon, 1 Aug 2011 17:06:54 -0700 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: <20023.16286.89015.854814@gargle.gargle.HOWL> Sounds great. g. 
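A minimal sketch of the Build.PL change proposed in this thread, assuming a Module::Build-based distribution. The distribution name 'Bio-Example' and its version are placeholders, not taken from any BioPerl repository; only the 'build_requires' entry for Test::Most comes from the proposal itself.

```perl
#!/usr/bin/perl
# Build.PL -- illustrative sketch only; 'Bio-Example' and '0.01' are
# hypothetical placeholders, not values from a real BioPerl repository.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    dist_name    => 'Bio-Example',   # hypothetical distribution name
    dist_version => '0.01',          # hypothetical version
    license      => 'perl',

    # Needed only while building and running the test suite,
    # not by the installed modules at run time.
    build_requires => {
        'Test::Most' => 0,           # 0 means any version is acceptable
    },
);

# Writes the ./Build script that 'perl Build.PL' is expected to produce.
$build->create_build_script;
```

With Test::Most declared this way, the copies of Test::Warn and Test::Exception bundled in t/lib would no longer be needed, which is the point of the proposal. Test files would then load Test::Most where they now load Bio::Root::Test; any helpers specific to Bio::Root::Test would presumably need a separate replacement.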
From carandraug+dev at gmail.com Tue Aug 2 10:00:32 2011 From: carandraug+dev at gmail.com (Carnë Draug) Date: Tue, 2 Aug 2011 15:00:32 +0100 Subject: [Bioperl-l] wiki administrator needed Message-ID: Hi! I have a problem with the bioperl wiki and have sent a support request to 'support at open-bio.org' as instructed here (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I got the ticket ID #966. This was 2 weeks ago. Can someone with administrator rights on the wiki do something about it? Thanks in advance, Carnë Draug From p.j.a.cock at googlemail.com Tue Aug 2 10:56:30 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 2 Aug 2011 15:56:30 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: Message-ID: 2011/8/2 Carnë Draug : > Hi! > > I have a problem with the bioperl wiki and have sent a support request > to 'support at open-bio.org' as instructed here > (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I > got the ticket ID #966. This was 2 weeks ago. Can someone with > administrator rights on the wiki do something about it? > > Thanks in advance, > Carnë Draug What was the problem with the wiki (for the benefit of those of us who might be able to fix it but are not on the support system and didn't get your email)? Peter From carandraug+dev at gmail.com Tue Aug 2 11:06:10 2011 From: carandraug+dev at gmail.com (Carnë Draug) Date: Tue, 2 Aug 2011 16:06:10 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: Message-ID: 2011/8/2 Peter Cock : > 2011/8/2 Carnë Draug : >> I have a problem with the bioperl wiki and have sent a support request >> to 'support at open-bio.org' as instructed here >> (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I >> got the ticket ID #966. This was 2 weeks ago. Can someone with >> administrator rights on the wiki do something about it?
> > What was the problem with the wiki (for the benefit of those > of us who might be able to fix it but are not on the support > system and didn't get your email)? Guess there should be no problem mentioning this on this open mailing list. Here's the e-mail I sent back then: When logging in with OpenID, I accidentally created a new account. Now I can't use that OpenID for my real account since it's connected to that other account. It also doesn't let me remove that OpenID from that account. My real account has the nickname 'Carandraug'. The account I created by accident has the nickname '~carandraug' (because I was trying to connect my account with the OpenID of https://launchpad.net/~carandraug). Could someone please remove the '~carandraug' account? I couldn't find a button to do so. From hlapp at drycafe.net Tue Aug 2 12:25:48 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 2 Aug 2011 12:25:48 -0400 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: Message-ID: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> I don't think the wiki allows removing of accounts (only blocking). Someone would have to go into the MySQL database and do that. -hilmar On Aug 2, 2011, at 11:06 AM, Carnë Draug wrote: > 2011/8/2 Peter Cock : >> 2011/8/2 Carnë Draug : >>> I have a problem with the bioperl wiki and have sent a support >>> request >>> to 'support at open-bio.org' as instructed here >>> (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I >>> got the ticket ID #966. This was 2 weeks ago. Can someone with >>> administrator rights on the wiki do something about it? >> >> What was the problem with the wiki (for the benefit of those >> of us who might be able to fix it but are not on the support >> system and didn't get your email)? > > Guess there should be no problem mentioning this on this open mailing > list. Here's the e-mail I sent back then: > > When logging in with OpenID, I accidentally created a new account.
Now I > can't use that OpenID for my real account since it's connected to that > other account. It also doesn't let me remove that OpenID from that > account. > > My real account has the nickname 'Carandraug'. > > The account I created by accident has the nickname '~carandraug' > (because I was trying to connect my account with the OpenID of > https://launchpad.net/~carandraug). > > Could someone please remove the '~carandraug' account? I couldn't find > a button to do so. -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From p.j.a.cock at googlemail.com Tue Aug 2 12:27:11 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 2 Aug 2011 17:27:11 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: 2011/8/2 Hilmar Lapp : > I don't think the wiki allows removing of accounts (only blocking). Someone > would have to go into the MySQL database and do that. The MediaWiki FAQ says don't do that, but does mention an optional add-on for merging wiki user accounts. We could block the unwanted account instead. Peter From cjfields at illinois.edu Tue Aug 2 12:35:36 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 11:35:36 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: I don't know if blocking that account will solve the OpenID problem (that it is associated with the bad account), but maybe merging that account and Carnë's good one will work. Maybe it's worth looking at the add-on.
chris On Aug 2, 2011, at 11:27 AM, Peter Cock wrote: > 2011/8/2 Hilmar Lapp : >> I don't think the wiki allows removing of accounts (only blocking). Someone >> would have to go into the MySQL database and do that. > > The MediaWiki FAQ says don't do that, but does mention an > optional add-on for merging wiki user accounts. > > We could block the unwanted account instead. > > Peter From cjfields at illinois.edu Tue Aug 2 12:38:01 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 11:38:01 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: Carnë, Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. chris On Aug 2, 2011, at 11:27 AM, Peter Cock wrote: > 2011/8/2 Hilmar Lapp : >> I don't think the wiki allows removing of accounts (only blocking). Someone >> would have to go into the MySQL database and do that. > > The MediaWiki FAQ says don't do that, but does mention an > optional add-on for merging wiki user accounts. > > We could block the unwanted account instead. > > Peter From carandraug+dev at gmail.com Tue Aug 2 12:58:41 2011 From: carandraug+dev at gmail.com (Carnë Draug) Date: Tue, 2 Aug 2011 17:58:41 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: On 2 August 2011 17:38, Chris Fields wrote: > Try logging in with the bad account, then go under 'my preferences'.
There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. I had tried that the first time. However, it didn't let me do it because that OpenID was the one used to create the account. Carnë From ihok at hotmail.com Tue Aug 2 13:29:43 2011 From: ihok at hotmail.com (Jack Tanner) Date: Tue, 2 Aug 2011 13:29:43 -0400 Subject: [Bioperl-l] fastq quality with initial @ Message-ID: i've got a fastq file with PHRED quality strings that sometimes start with '@'. this breaks the _index_file routine in Bio/Index/Fastq.pm. i would've filed this in bugzilla, but i'm not authorized to do that. From cjfields at illinois.edu Tue Aug 2 14:59:00 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 13:59:00 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: Let's see if we can get the merge account add-in working, then. chris On Aug 2, 2011, at 11:58 AM, Carnë Draug wrote: > On 2 August 2011 17:38, Chris Fields wrote: >> Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. > > I had tried that the first time. However, it didn't let me do it because > that OpenID was the one used to create the account. > > Carnë From cjfields at illinois.edu Tue Aug 2 15:00:47 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 14:00:47 -0500 Subject: [Bioperl-l] fastq quality with initial @ In-Reply-To: References: Message-ID: <441DB637-5586-488F-8943-FEA4D56C276B@illinois.edu> On Aug 2, 2011, at 12:29 PM, Jack Tanner wrote: > > i've got a fastq file with PHRED quality strings that sometimes start with '@'.
this breaks the _index_file routine in Bio/Index/Fastq.pm. > i would've filed this in bugzilla, but i'm not authorized to do that. We no longer use bugzilla (as of v 1.6.900); see here: http://www.bioperl.org/wiki/Bugs Just register for an account and submit. I would check the latest code before doing so, just in case it has been fixed. chris From bosborne11 at verizon.net Tue Aug 2 16:24:54 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 02 Aug 2011 16:24:54 -0400 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: Chris, This is the one I've used: http://www.mediawiki.org/wiki/Extension:User_Merge_and_Delete BIO On Aug 2, 2011, at 2:59 PM, Chris Fields wrote: > Let's see if we can get the merge account add-in working, then. > > chris > > On Aug 2, 2011, at 11:58 AM, Carn? Draug wrote: > >> On 2 August 2011 17:38, Chris Fields wrote: >>> Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. >> >> I had try that the first time. However, it didn't let me do it because >> that OpenID was the one used to create the account. >> >> Carn? 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Aug 2 18:01:42 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 17:01:42 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Carn?, I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. See if that works. chris On Aug 2, 2011, at 1:59 PM, Chris Fields wrote: > Let's see if we can get the merge account add-in working, then. > > chris > > On Aug 2, 2011, at 11:58 AM, Carn? Draug wrote: > >> On 2 August 2011 17:38, Chris Fields wrote: >>> Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. >> >> I had try that the first time. However, it didn't let me do it because >> that OpenID was the one used to create the account. >> >> Carn? 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From carandraug+dev at gmail.com Tue Aug 2 18:19:38 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 2 Aug 2011 23:19:38 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On 2 August 2011 23:01, Chris Fields wrote: > Carn?, > > I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. ?See if that works. > > chris When I try to add this OpenID to my account, I still get the error: "That is someone else's OpenID." If I try to log in with this OpenID, after saying that I'm logged in successfully, the site still looks as if I'm not logged in, with a button to 'log in' and an IP address instead of a username. Another problem that I have when logging is that sometimes mediawiki sends 'https://login.launchpad.net/ id/y7xtYzD' instead of 'https://login.launchpad.net/~carandraug' to the launchpad server. I don't know what's causing this. Trying to backspace and delete what may be invisible characters before and after the string sometimes solves this. This happens even though I type this character by character so if there's any invisble stuff on the form it must be there before. This occurs when using Iceweasel 3.5 (in Debian), Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). Carn? 
Draug From cjfields at illinois.edu Tue Aug 2 18:39:19 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 17:39:19 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On Aug 2, 2011, at 5:19 PM, Carn? Draug wrote: > On 2 August 2011 23:01, Chris Fields wrote: >> Carn?, >> >> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. See if that works. >> >> chris > > When I try to add this OpenID to my account, I still get the error: > "That is someone else's OpenID." Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. Kinda wondered if that would happen... > If I try to log in with this OpenID, after saying that I'm logged in > successfully, the site still looks as if I'm not logged in, with a > button to 'log in' and an IP address instead of a username. > > Another problem that I have when logging is that sometimes mediawiki > sends 'https://login.launchpad.net/ id/y7xtYzD' instead of > 'https://login.launchpad.net/~carandraug' to the launchpad server. I > don't know what's causing this. Trying to backspace and delete what > may be invisible characters before and after the string sometimes > solves this. This happens even though I type this character by > character so if there's any invisble stuff on the form it must be > there before. This occurs when using Iceweasel 3.5 (in Debian), > Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). > > Carn? Draug Not sure myself, sounds like a MW bug. See if the OpenID works first, then maybe we can address that. 
chris From carandraug+dev at gmail.com Tue Aug 2 18:56:49 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 2 Aug 2011 23:56:49 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: 2011/8/2 Chris Fields : > On Aug 2, 2011, at 5:19 PM, Carnë Draug wrote: >> On 2 August 2011 23:01, Chris Fields wrote: >>> Carnë, >>> >>> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. See if that works. >>> >> >> When I try to add this OpenID to my account, I still get the error: >> "That is someone else's OpenID." > > Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. This is solved. I connected my account with this OpenID and can now log in with it. Thank you >> Another problem that I have when logging is that sometimes mediawiki >> sends 'https://login.launchpad.net/ id/y7xtYzD' instead of >> 'https://login.launchpad.net/~carandraug' to the launchpad server. I >> don't know what's causing this. Trying to backspace and delete what >> may be invisible characters before and after the string sometimes >> solves this. This happens even though I type this character by >> character so if there's any invisible stuff on the form it must be >> there before. This occurs when using Iceweasel 3.5 (in Debian), >> Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). This still happens sometimes. It just happened now. I had also filed a support request about this issue some weeks ago (ticket #965). No idea what's been causing this. Carnë
From cjfields at illinois.edu Tue Aug 2 21:55:23 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 20:55:23 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On Aug 2, 2011, at 5:56 PM, Carn? Draug wrote: > 2011/8/2 Chris Fields : >> On Aug 2, 2011, at 5:19 PM, Carn? Draug wrote: >>> On 2 August 2011 23:01, Chris Fields wrote: >>>> Carn?, >>>> >>>> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. See if that works. >>>> >>> >>> When I try to add this OpenID to my account, I still get the error: >>> "That is someone else's OpenID." >> >> Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. > > This is solved. I connected my account with this OpenID and can now > log in with it. Thank you No problem. Apparently there is a bug fix in the more recent versions of OpenID and UserMerge, I'll add a redmine task to make sure they get updated (have my hands full right now, and OpenID can sometimes be tricky to debug). >>> Another problem that I have when logging is that sometimes mediawiki >>> sends 'https://login.launchpad.net/ id/y7xtYzD' instead of >>> 'https://login.launchpad.net/~carandraug' to the launchpad server. I >>> don't know what's causing this. Trying to backspace and delete what >>> may be invisible characters before and after the string sometimes >>> solves this. This happens even though I type this character by >>> character so if there's any invisble stuff on the form it must be >>> there before. This occurs when using Iceweasel 3.5 (in Debian), >>> Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). > > This still happens sometimes. It just happened now. 
I had also filed a > support request about this issue some weeks ago (ticket #965). No idea > what's been causing this. > > Carnë Okay, as long as it's noted somewhere. chris From kai.blin at biotech.uni-tuebingen.de Wed Aug 3 04:55:04 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Wed, 03 Aug 2011 10:55:04 +0200 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior Message-ID: <4E390CE8.2050100@biotech.uni-tuebingen.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folks, as I mentioned on https://redmine.open-bio.org/issues/3264 there is something odd going on with Bio::Root::IO's _readline/_pushback functions. This seems to be intentional, at least there is a test case asserting the behaviour I'm seeing. It is however very confusing to the unsuspecting programmer using the code. One assumption I'd immediately make would be that if I have code that does a $foo = $io->_readline; $io->_pushback($foo); $bar = $io->_readline;, $foo will be the same string as $bar, regardless of what other pieces of the code did. Currently, this is not the case, because the readbuffer that _pushback pushes back into has new strings appended to the end but readline removes them from the front. This easily violates the "principle of least surprise", so I think we should change the readbuffer to a stack. As far as I can tell, changing the _pushback function to "unshift" instead of "push" to the readbuffer breaks only the Root/RootIO.t test designed to test the old behaviour. I don't see any other tests failing on my system that don't fail without this patch. Any comments from the core devs? Cheers, Kai - -- Dipl.-Inform.
Kai Blin kai.blin at biotech.uni-tuebingen.de Institute for Microbiology and Infection Medicine Division of Microbiology/Biotechnology Eberhard-Karls-Universität Tübingen Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 D-72076 Tübingen Fax : ++49 7071 29-5979 Germany Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJOOQzoAAoJEKM5lwBiwTTPO6QIAMDN1bAm1FFD98F0rhN7TCpW sV2sLkQDESK9YjCxp3kAqCpg7ZCArcA5l7HmEdAZFTzdFnsfnvKJmNB86C30QXJs 6XcYSbvBIPQdhjK7WIhG2pANItiTxKTGgXDZklVjgj2dVT4kSkCgdGYAAMssT1hn n1/jkBJu5uuCq43Wv5Ia+wEhdN0M+xgKc9x7MF/ikO2qr6x24odMNTW8VgyLsYie p9M68U23aStip2rxV1hrhZzbnjLz66V6O9fIEHmm5CYLfcGXkcrclzLIeptepSj1 bj/7dWIdXy8VnoSNx4RbckHSkMbdIkmyPKzmoYFN7p3FvmrSXsOmB6nfD0hEkbY= =S5ff -----END PGP SIGNATURE----- From shelly.mh at gmail.com Tue Aug 2 06:19:33 2011 From: shelly.mh at gmail.com (Shelly M) Date: Tue, 2 Aug 2011 13:19:33 +0300 Subject: [Bioperl-l] question regarding Bio::DB::CUTG Message-ID: Hello, My name is Shelly and I'm a student at the Hebrew University of Jerusalem. I'm trying to use the package Bio::DB::CUTG but I have some trouble retrieving the right table for a given organism. For example, if I write my $cdtable = Bio::DB::CUTG->new(-sp =>'Mus musculus'); I get a warning message, "MSG: too many species - not a unique species id", and it returns _species => mitochondrion Mus musculus. So my question is: what is the exact format for retrieving the specific organism? Thanks a lot for the help, Shelly From maximilien1er at gmail.com Tue Aug 2 22:50:44 2011 From: maximilien1er at gmail.com (=?ISO-8859-1?Q?Maxime_D=E9raspe?=) Date: Tue, 2 Aug 2011 19:50:44 -0700 (PDT) Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation Message-ID: Hi, when I parse a genbank file no matter what I do, the /translation="MKAV.."
tag value of a CDS never appear in the last place as it should be. Other tags like /note= /product comes after / translation which it's not the usual practice with genbank file. Could anyone have an idea how to deal with it... put /translation tag value in the last place when I write the genbank file. Thank you ! Max From shachigahoimbi at gmail.com Wed Aug 3 02:00:44 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Wed, 3 Aug 2011 11:30:44 +0530 Subject: [Bioperl-l] How to show branch length value in tree Message-ID: Dear All I am using Bio::Tree modules for constructing and drawing tree. *I am unable to show branch length value in tree. * Please tell me How can I do this, if anybody knows. Here is my script which i am using...and i also attached generated tree. Thanks in advance ################################################################################################ use Bio::AlignIO; use Bio::Align::ProteinStatistics; use Bio::Tree::DistanceFactory; use Bio::TreeIO; use Bio::Tree::Draw::Cladogram; # for a dna alignment # can also use ProteinStatistics my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw'); my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA'); my $stats = Bio::Align::ProteinStatistics->new; my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd'); while( my $aln = $alnio->next_aln ) { my $mat = $stats->distance(-method => 'Kimura', -align => $aln); my $tree = $dfactory->make_tree($mat); $treeout->write_tree($tree); } my $dir = shift || '.'; opendir(DIR, $dir) || die $!; for my $file ( readdir(DIR) ) { next unless $file =~ /(\S+)\.dnd$/; my $stem = $1; my $treeio = Bio::TreeIO->new('-format' => 'newick', '-file' => "$dir/$file"); if( my $t1 = $treeio->next_tree ) { my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, -tree => $t1, -compact => 0); $obj1->print(-file => "$dir/$stem.eps"); } } 
######################################################################################################## -- Regards, Shachi -------------- next part -------------- A non-text attachment was scrubbed... Name: ADP1.dnd Type: application/octet-stream Size: 1369 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ADP1.eps Type: application/postscript Size: 17718 bytes Desc: not available URL: From cjfields at illinois.edu Wed Aug 3 09:10:20 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 08:10:20 -0500 Subject: [Bioperl-l] Question to Bio::SearchIO::infernal.pm In-Reply-To: <4E32E14B020000EE00004F57@gwia1.boku.ac.at> References: <4E32E14B020000EE00004F57@gwia1.boku.ac.at> Message-ID: Nadine, Hard to guess w/o seeing the report, but I'm not terribly surprised. I believe I only coded for simple 1 CM reports, IIRC. You'll have to file this as a bug on redmine along with an example. chris On Jul 29, 2011, at 9:35 AM, Nadine Elpida Tatto wrote: > Hi There! > > > > I was wondering if you would or can help me. > > > I have an infernal report containing about 2000 CMs from an infernal run against Rfam.cm. To parse this report I wanted to use Bio::SearchIO::infernal.pm. Unfortunately this turned out to be a problem for me, because "$parser->next_result" only delivers the result for the first CM in the report and nothing more. > > > My code: > #!/usr/bin/perl -w > > > use strict;use Data::Dumper; > use Bio::SearchIO; > > > my $infile = $ARGV[0]; # infernal report > my $parser = Bio::SearchIO->new(-format => 'Infernal', > -file => $infile); > > > while( my $result = $parser->next_result ) { > print $result->query_name . "\n"; > } > > > exit; > > > > > The output: > > > ntatto:~$ ./infernalParser.pl infernal.output > 5S_rRNA > ntatto:~$ > > > > > I would expect the following (like parsing a blast report): > > > ntatto:~$ ./infernalParser.pl infernal.output > 5S_rRNA > 5_8S_rRNA > U1 > ... 
> ntatto:~$ > > > > I would be glad for help. > > > Thank you in advance. > > > Best Regards > > > N Tatto > > > > > > > > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From p.j.a.cock at googlemail.com Wed Aug 3 09:46:06 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 14:46:06 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: Message-ID: 2011/8/3 Maxime D?raspe : > Hi, > > when I parse a genbank file no matter what I do, the / > translation="MKAV.." tag value of a CDS never appear in the last place > as it should be. Other tags like /note= /product comes after / > translation which it's not the usual practice with genbank file. Could > anyone have an idea how to deal with it... put /translation tag value > in the last place when I write the genbank file. > > Thank you ! > > Max Hi Max, I'm not aware of anything in the feature table specification about the order of the feature qualifiers (the "tags" like /note and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/ I suspect BioPerl is using a hash (Biopython uses a dictionary) for the feature qualifiers, which would discard the order. Why do you care about the order? Peter From roy.chaudhuri at gmail.com Wed Aug 3 09:58:22 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 03 Aug 2011 14:58:22 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: References: Message-ID: <4E3953FE.5080304@gmail.com> Hi Shachi, I don't think you can draw labels on branches using Bio::Tree::Draw::Cladogram. 
However, it will draw node labels, so you could copy the branch lengths over to the node ids: my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, -tree => $t1, -compact => 0); for my $node ($tree->get_nodes) { $node->id($node->branch_length) if defined $node->branch_length; } $obj1->print(-file => "$dir/$stem.eps") Incidentally, in your script you write the tree out to a file, then read it back in using TreeIO. This is unnecessary, you can use $tree directly as input to Bio::Tree::Draw::Cladogram. Alternatively, you could write out a newick file and use non-Bioperl software such as njplot or MEGA to draw your tree with labelled branch lengths. Cheers, Roy. On 03/08/2011 07:00, Shachi Gahoi wrote: > Dear All > > I am using Bio::Tree modules for constructing and drawing tree. *I am unable > to show branch length value in tree. > * > Please tell me How can I do this, if anybody knows. > > Here is my script which i am using...and i also attached generated tree. > > Thanks in advance > > ################################################################################################ > > use Bio::AlignIO; > use Bio::Align::ProteinStatistics; > use Bio::Tree::DistanceFactory; > use Bio::TreeIO; > use Bio::Tree::Draw::Cladogram; > > # for a dna alignment > # can also use ProteinStatistics > > my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw'); > > my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA'); > > my $stats = Bio::Align::ProteinStatistics->new; > > my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd'); > > while( my $aln = $alnio->next_aln ) > { > my $mat = $stats->distance(-method => 'Kimura', -align => $aln); > > my $tree = $dfactory->make_tree($mat); > $treeout->write_tree($tree); > } > > my $dir = shift || '.'; > > opendir(DIR, $dir) || die $!; > for my $file ( readdir(DIR) ) > { > next unless $file =~ /(\S+)\.dnd$/; > my $stem = $1; > my $treeio = Bio::TreeIO->new('-format' => 'newick', > '-file' => 
"$dir/$file"); > > if( my $t1 = $treeio->next_tree ) > { > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > $obj1->print(-file => "$dir/$stem.eps"); > } > } > > ######################################################################################################## > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From roy.chaudhuri at gmail.com Wed Aug 3 10:01:18 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 03 Aug 2011 15:01:18 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: <4E3953FE.5080304@gmail.com> References: <4E3953FE.5080304@gmail.com> Message-ID: <4E3954AE.2080401@gmail.com> Sorry, the code had a typo, it should be: my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, -tree => $t1, -compact => 0); for my $node ($t1->get_nodes) { $node->id($node->branch_length) if defined $node->branch_length; } $obj1->print(-file => "$dir/$stem.eps") On 03/08/2011 14:58, Roy Chaudhuri wrote: > Hi Shachi, > > I don't think you can draw labels on branches using > Bio::Tree::Draw::Cladogram. However, it will draw node labels, so you > could copy the branch lengths over to the node ids: > > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > for my $node ($tree->get_nodes) { > $node->id($node->branch_length) if defined $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > Incidentally, in your script you write the tree out to a file, then read > it back in using TreeIO. This is unnecessary, you can use $tree directly > as input to Bio::Tree::Draw::Cladogram. > > Alternatively, you could write out a newick file and use non-Bioperl > software such as njplot or MEGA to draw your tree with labelled branch > lengths. > > Cheers, > Roy. 
> > On 03/08/2011 07:00, Shachi Gahoi wrote: >> Dear All >> >> I am using Bio::Tree modules for constructing and drawing tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also attached generated tree. >> >> Thanks in advance >> >> ################################################################################################ >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw'); >> >> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics->new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', -align => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> ######################################################################################################## >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Wed Aug 3 10:08:33 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 
2011 09:08:33 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: Message-ID: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu> On Aug 3, 2011, at 8:46 AM, Peter Cock wrote: > 2011/8/3 Maxime Déraspe : >> Hi, >> >> when I parse a genbank file no matter what I do, the / >> translation="MKAV.." tag value of a CDS never appear in the last place >> as it should be. Other tags like /note= /product comes after / >> translation which it's not the usual practice with genbank file. Could >> anyone have an idea how to deal with it... put /translation tag value >> in the last place when I write the genbank file. >> >> Thank you ! >> >> Max > > Hi Max, > > I'm not aware of anything in the feature table specification > about the order of the feature qualifiers (the "tags" like /note > and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/ > > I suspect BioPerl is using a hash (Biopython uses a dictionary) > for the feature qualifiers, which would discard the order. > > Why do you care about the order? > > Peter Yes, it uses a hash based on the feature tags. Not sure how Biopython handles it but my guess is something similar (Peter?). The output order was never a chief concern of ours. To tell the truth our main focus has never been simple conversion, except to transform data into a format that is more manageable/normalized. For those interested in making this change, all the code for printing features is in one method in Bio::SeqIO::genbank, _print_GenBank_FTHelper(). The best way to handle this would be to allow an optional coderef/callback that takes the feature (or the tags) and allows custom sorting and printing; I don't want to get into messy semantics on how to specifically sort tags, best to let the user decide.
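For anyone wanting to try the callback idea, here is a rough standalone sketch of how an optional ordering coderef could work. It does not touch Bio::SeqIO::genbank at all: `format_qualifiers` and `$translation_last` are made-up names for illustration, the qualifiers are held in a plain hash of tag => [values], and the values themselves are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Bio::SeqIO::genbank): format qualifiers
# stored as tag => [values], using an optional caller-supplied callback
# that decides the order in which the tags are emitted.
sub format_qualifiers {
    my ($tags, $order_cb) = @_;

    # Default when no callback is given: plain alphabetical order.
    $order_cb ||= sub { sort @_ };

    my @out;
    for my $tag ($order_cb->(keys %$tags)) {
        push @out, qq{/$tag="$_"} for @{ $tags->{$tag} };
    }
    return @out;
}

# Example callback: alphabetical order, but with 'translation' moved last.
my $translation_last = sub {
    my @tags = sort @_;
    return (
        (grep { $_ ne 'translation' } @tags),
        (grep { $_ eq 'translation' } @tags),
    );
};

# Placeholder CDS qualifiers, for illustration only.
my %cds = (
    translation => ['MKAV'],
    product     => ['hypothetical protein'],
    note        => ['first note', 'second note'],
);

print "$_\n" for format_qualifiers(\%cds, $translation_last);
```

The point of passing a coderef is that the writer never has to know what a "correct" order is; whoever wants /translation last (or any other order) supplies their own sort.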
chris From cjfields at illinois.edu Wed Aug 3 10:16:37 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 09:16:37 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4E390CE8.2050100@biotech.uni-tuebingen.de> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi folks, > > as I mentioned on https://redmine.open-bio.org/issues/3264 there is > something odd going on with Bio::Root::IO's _readline/_pushback > functions. This seems to be intentional, at least there is a test case > asserting the behaviour I'm seeing. It his however very confusing to the > unexpecting programmer using the code. > > One assumption I'd immediately make would be that if I have code that > does a $foo = $io->_readline; $io->_pushback($foo); $bar = > $io->_readline;, $foo will be the same string as $bar, regardless what > other pieces of the code did. Currently, this is not the case, because > the readbuffer that _pushback pushes back into has new strings appended > to the end but readline removes them from the front. I think this test is performed in the regressions already, but if not then it is more than welcome. > This easily violates the "principle of least surprise", so I think we > should change the readbuffer to a stack. As far as I can tell, changing > the _pushback function to "unshift" instead of "push" to the readbuffer > breaks only the Root/RootIO.t test designed to test the old behaviour. I > don't see any other tests failing on my system that don't fail without > this patch. > > Any comments from the core devs? I don't have a problem with that beyond the change to the RootIO.t tests (it implies a specific behavior that some developers expect, so is a very subtle API change). However, this is how one would expect it, to be more like an 'unread' stack instead of a queue. 
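To make the queue-versus-stack difference concrete, here is a toy sketch. It is not Bio::Root::IO: `ToyReader`, `next_line` and `push_back` are invented names, and the only assumption carried over from the thread is that reads take from the front of the buffer while pushback either appends (queue, as described for the current code) or prepends (stack, as proposed).

```perl
use strict;
use warnings;

# Toy stand-in for a reader with a pushback buffer. Hypothetical class,
# for illustration only; it is NOT the Bio::Root::IO implementation.
package ToyReader;

sub new {
    my ($class, %args) = @_;
    my $self = {
        lines  => [ @{ $args{lines}  || [] } ],  # pretend filehandle contents
        buffer => [ @{ $args{buffer} || [] } ],  # lines already pushed back
        mode   => $args{mode} || 'stack',        # 'stack' or 'queue'
    };
    return bless $self, $class;
}

# Buffered lines are always taken from the FRONT, then the source is read.
sub next_line {
    my ($self) = @_;
    return shift @{ $self->{buffer} } if @{ $self->{buffer} };
    return shift @{ $self->{lines} };
}

# 'queue' appends to the end of the buffer; 'stack' puts the line at the front.
sub push_back {
    my ($self, $line) = @_;
    if ($self->{mode} eq 'stack') {
        unshift @{ $self->{buffer} }, $line;
    }
    else {
        push @{ $self->{buffer} }, $line;
    }
    return;
}

package main;

# Read a line, push it back, read again; the buffer already holds two lines
# that some other piece of code pushed back earlier.
sub roundtrip {
    my ($mode) = @_;
    my $io = ToyReader->new(
        lines  => ["line3\n"],
        buffer => ["line1\n", "line2\n"],
        mode   => $mode,
    );
    my $foo = $io->next_line;
    $io->push_back($foo);
    my $bar = $io->next_line;
    return ($foo, $bar);
}

for my $mode (qw(queue stack)) {
    my ($foo, $bar) = roundtrip($mode);
    printf "%-5s read-back gives same line: %s\n",
        $mode, ($foo eq $bar ? 'yes' : 'no');
}
```

With a non-empty buffer, the queue variant hands back a different line after the pushback, while the stack variant returns the line that was just pushed back.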
In fact, there is a module I used for Biome's pushback/readline called IO::Unread that implements an IO layer for mimicking this behavior, might be worth looking into. > Cheers, > Kai chris Christopher Fields Senior Research Scientist National Center for Supercomputing Applications Institute for Genomic Biology University of Illinois Urbana-Champaign 1206 W. Gregory Dr. , MC-195 Urbana, IL 61801 From p.j.a.cock at googlemail.com Wed Aug 3 10:45:21 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 15:45:21 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu> References: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu> Message-ID: On Wed, Aug 3, 2011 at 3:08 PM, Chris Fields wrote: > > Yes, it uses a hash based on the feature tags. Not sure how Biopython > handles it but my guess is something similar (Peter?). Yes, we key on the feature qualifier (e.g. note or product) and the values are a list of qualifier values (e.g. you can have two notes). > The output order was never a chief concern of ours. To tell the truth > our main focus has never been simple conversion, except to transform > data into a format that is more manageable/normalized. > > For those interested in making this change, all the code for printing > features is in one method in Bio::SeqIO::genbank, _print_GenBank_FTHelper(). > The best way to handle this would be to allow an optional coderef/callback > that takes the feature (or the tags) and allows custom sorting and printing; > I don't want to get into messy semantics on how to specifically sort tags, > best to let the user decide. For Biopython switching from the default dictionary (hash type) to an order preserving dictionary would be one option. I too have no wish to try and implement qualifier sorting without an explicit standard.
Peter From maximilien1er at gmail.com Wed Aug 3 10:48:05 2011 From: maximilien1er at gmail.com (=?ISO-8859-1?Q?Maxime_D=E9raspe?=) Date: Wed, 3 Aug 2011 07:48:05 -0700 (PDT) Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: Message-ID: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> > Hi Max, > > I'm not aware of anything in the feature table specification > about the order of the feature qualifiers (the "tags" like /note > and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/ > > I suspect BioPerl is using a hash (Biopython uses a dictionary) > for the feature qualifiers, which would discard the order. > > Why do you care about the order? > > Peter > Hi Peter, I care about the order for the submission to ncbi. But I guess they will reformat the file before getting it in their database. It's also visually better when the translation of the protein comes in the end of the annotation for the CDS and not before /product, /note .... Anyway maybe I'll reformat the file in sequin table for a direct submission to ncbi with sequin. Thank you. Max From p.j.a.cock at googlemail.com Wed Aug 3 12:00:01 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 17:00:01 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: 2011/8/3 Maxime Déraspe : >> >> Why do you care about the order? >> > > Hi Peter, > > I care about the order for the submission to ncbi. Do the NCBI have some guidelines which ask for a particular order? > But I guess they > will reformat the file before getting it in their database. They seem to generate the official GenBank files from their database - so I doubt the input order matters.
> It's also > visually better when the translation of the protein comes in the end > of the annotation for the CDS and not before /product, /note .... I do see your point, but if that were the only motivation I wouldn't want to make generating GenBank output any more complicated than it already is. > Anyway maybe I'll reformat the file in sequin table for a direct > submission to ncbi with sequin. > > Thank you. > > Max Peter From cjfields at illinois.edu Wed Aug 3 12:52:02 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 11:52:02 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: > 2011/8/3 Maxime D?raspe : >>> >>> Why do you care about the order? >>> >> >> Hi Peter, >> >> I care about the order for the submission to ncbi. > > Do the NCBI have some guidelines which ask for a particular order? No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. >> But I guess they >> will reformat the file before getting it in their database. > > They seem to generate the official GenBank files from their > database - so I doubt the input order matters. Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). >> It's also >> visually better when the translation of the protein comes in the end >> of the annotation for the CDS and not before /product, /note .... > > I do see your point, but if that were the only motivation I wouldn't > want to make generating GenBank output any more complicated > than it already is. ... >> Anyway maybe I'll reformat the file in sequin table for a direct >> submission to ncbi with sequin. 
>> >> Thank you. >> >> Max > > Peter Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): https://redmine.open-bio.org/ chris From cjfields at illinois.edu Wed Aug 3 13:10:31 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 12:10:31 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Message-ID: <51452A39-42B7-4BBF-9F50-A37419E75454@illinois.edu> IMHO I find genbank too unwieldy, but it's nice to know the output works for NCBI submission. chris On Aug 3, 2011, at 12:06 PM, Brian Osborne wrote: > Peter, > > I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. 
In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. > > Brian O > > On Aug 3, 2011, at 12:52 PM, Chris Fields wrote: > >> On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: >> >>> 2011/8/3 Maxime D?raspe : >>>>> >>>>> Why do you care about the order? >>>>> >>>> >>>> Hi Peter, >>>> >>>> I care about the order for the submission to ncbi. >>> >>> Do the NCBI have some guidelines which ask for a particular order? >> >> No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. >> >>>> But I guess they >>>> will reformat the file before getting it in their database. >>> >>> They seem to generate the official GenBank files from their >>> database - so I doubt the input order matters. >> >> Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). >> >>>> It's also >>>> visually better when the translation of the protein comes in the end >>>> of the annotation for the CDS and not before /product, /note .... >>> >>> I do see your point, but if that were the only motivation I wouldn't >>> want to make generating GenBank output any more complicated >>> than it already is. >> ... >>>> Anyway maybe I'll reformat the file in sequin table for a direct >>>> submission to ncbi with sequin. >>>> >>>> Thank you. >>>> >>>> Max >>> >>> Peter >> >> >> Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. 
Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. >> >> We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. >> >> Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): >> >> https://redmine.open-bio.org/ >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bosborne11 at verizon.net Wed Aug 3 13:06:05 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 03 Aug 2011 13:06:05 -0400 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Peter, I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. Brian O On Aug 3, 2011, at 12:52 PM, Chris Fields wrote: > On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: > >> 2011/8/3 Maxime D?raspe : >>>> >>>> Why do you care about the order? >>>> >>> >>> Hi Peter, >>> >>> I care about the order for the submission to ncbi. >> >> Do the NCBI have some guidelines which ask for a particular order? 
> > No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. > >>> But I guess they >>> will reformat the file before getting it in their database. >> >> They seem to generate the official GenBank files from their >> database - so I doubt the input order matters. > > Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). > >>> It's also >>> visually better when the translation of the protein comes in the end >>> of the annotation for the CDS and not before /product, /note .... >> >> I do see your point, but if that were the only motivation I wouldn't >> want to make generating GenBank output any more complicated >> than it already is. > ... >>> Anyway maybe I'll reformat the file in sequin table for a direct >>> submission to ncbi with sequin. >>> >>> Thank you. >>> >>> Max >> >> Peter > > > Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. > > We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. 
> > Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): > > https://redmine.open-bio.org/ > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From lskatz at gmail.com Wed Aug 3 17:01:24 2011 From: lskatz at gmail.com (Lee Katz) Date: Wed, 3 Aug 2011 17:01:24 -0400 Subject: [Bioperl-l] SeqIO: paired end reads Message-ID: Hi all! I was wondering how to construct paired end reads from scratch. I know the locations of certain sequences across the genome with a high degree of confidence and so I want to give them to my assembler as paired end reads, along with my other sequence runs (454 and Illumina runs). I plan to use Newbler. My only problem is that I do not know the correct format in order to specify distance and sequences for a paired end reads run, and so I hope that there is a SeqIO solution. At the least, I hope that one bioperl member can point me to where the definition of the paired end reads file format is...? Thank you! --Lee From jason.stajich at gmail.com Wed Aug 3 17:17:01 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Wed, 3 Aug 2011 13:17:01 -0800 Subject: [Bioperl-l] SeqIO: paired end reads In-Reply-To: References: Message-ID: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> it depends on the assembler - For Illumina usually the paired ends end with /1 /2 and they have the same ID but are in two different files. Depends on if you are using interleaved paired reads or in two separate files. 
some just expect the paired reads to be mated by virtue of being in same order in two files. the ABYSS and Velvet manuals both explain what is expected so you will want to check on what are Newbler's assumptions on how the paired ends are encoded. There are simulator tools if that is what you are trying to do in the end? checkout wgsim which comes with samtools or try dnaa On Aug 3, 2011, at 1:01 PM, Lee Katz wrote: > Hi all! I was wondering how to construct paired end reads from scratch. I > know the locations of certain sequences across the genome with a high degree > of confidence and so I want to give them to my assembler as paired end > reads, along with my other sequence runs (454 and Illumina runs). I plan to > use Newbler. > > My only problem is that I do not know the correct format in order to specify > distance and sequences for a paired end reads run, and so I hope that there > is a SeqIO solution. At the least, I hope that one bioperl member can point > me to where the definition of the paired end reads file format is...? > > Thank you! > > --Lee > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From roy.chaudhuri at gmail.com Thu Aug 4 07:22:23 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Thu, 04 Aug 2011 12:22:23 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> Message-ID: <4E3A80EF.2010409@gmail.com> Hi Shachi, Please keep replies on the mailing list, that way others can follow the discussion. As I mentioned, it is not possible to draw njplot-style trees with labelled branches using Bio::Tree::Draw::Cladogram, it currently only labels nodes (you could perhaps add branch labels as a feature request on Redmine). 
The code I gave overwrites the existing "leaf" node ids (the accessions) with branch lengths. If you want to also keep the existing labels you could try something like:

for my $node ($t1->get_nodes) {
    if ($node->is_Leaf) {
        $node->id($node->branch_length.' '.$node->id);
    } else {
        $node->id($node->branch_length);
    }
}

Cheers,
Roy.

On 04/08/2011 05:36, Shachi Gahoi wrote:
> Thank You so much. Now branch length is coming in tree.
>
> But I want Accession number in place of node id.
>
> I attached snapshot of tree as I want. Please tell me how can I do this.
>
> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri wrote:
>
> Sorry, the code had a typo, it should be:
>
> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                            -tree => $t1,
>                                            -compact => 0);
> for my $node ($t1->get_nodes) {
>     $node->id($node->branch_length) if defined $node->branch_length;
> }
> $obj1->print(-file => "$dir/$stem.eps")
>
> On 03/08/2011 14:58, Roy Chaudhuri wrote:
>
> Hi Shachi,
>
> I don't think you can draw labels on branches using
> Bio::Tree::Draw::Cladogram. However, it will draw node labels, so you
> could copy the branch lengths over to the node ids:
>
> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                            -tree => $t1,
>                                            -compact => 0);
> for my $node ($tree->get_nodes) {
>     $node->id($node->branch_length) if defined $node->branch_length;
> }
> $obj1->print(-file => "$dir/$stem.eps")
>
> Incidentally, in your script you write the tree out to a file, then read
> it back in using TreeIO. This is unnecessary, you can use $tree directly
> as input to Bio::Tree::Draw::Cladogram.
>
> Alternatively, you could write out a newick file and use non-Bioperl
> software such as njplot or MEGA to draw your tree with labelled branch
> lengths.
>
> Cheers,
> Roy.
>
> On 03/08/2011 07:00, Shachi Gahoi wrote:
>
> Dear All
>
> I am using Bio::Tree modules for constructing and drawing tree. *I am unable
> to show branch length value in tree.
> *
> Please tell me How can I do this, if anybody knows.
>
> Here is my script which i am using...and i also attached generated tree.
>
> Thanks in advance
>
> ######################################################################
>
> use Bio::AlignIO;
> use Bio::Align::ProteinStatistics;
> use Bio::Tree::DistanceFactory;
> use Bio::TreeIO;
> use Bio::Tree::Draw::Cladogram;
>
> # for a dna alignment
> # can also use ProteinStatistics
>
> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw');
>
> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');
>
> my $stats = Bio::Align::ProteinStatistics->new;
>
> my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd');
>
> while( my $aln = $alnio->next_aln )
> {
>     my $mat = $stats->distance(-method => 'Kimura', -align => $aln);
>
>     my $tree = $dfactory->make_tree($mat);
>     $treeout->write_tree($tree);
> }
>
> my $dir = shift || '.';
>
> opendir(DIR, $dir) || die $!;
> for my $file ( readdir(DIR) )
> {
>     next unless $file =~ /(\S+)\.dnd$/;
>     my $stem = $1;
>     my $treeio = Bio::TreeIO->new('-format' => 'newick',
>                                   '-file' => "$dir/$file");
>
>     if( my $t1 = $treeio->next_tree )
>     {
>         my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                                    -tree => $t1,
>                                                    -compact => 0);
>         $obj1->print(-file => "$dir/$stem.eps");
>     }
> }
>
> ######################################################################
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Regards,
> Shachi

From razi.khaja at gmail.com Thu Aug 4 13:39:28 2011
From: razi.khaja at gmail.com (Razi Khaja)
Date: Thu, 4 Aug 2011 13:39:28 -0400
Subject: [Bioperl-l] BioPerl on GitHub will not install
Message-ID:

All,

I just checked out the latest development version of BioPerl from GitHub and
found that it does not install because bp_das_server.pl is missing. Building BioPerl 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm line 218. Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm line 218. Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. After copying the bp_das_server.pl that I had from a previous installation to 'blib/script', I was able to ./Build test and ./Build install the development version I checked out. Could someone test out this problem and fix it on github? if it really is a problem? Thanks, Razi From cjfields at illinois.edu Thu Aug 4 13:42:48 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 12:42:48 -0500 Subject: [Bioperl-l] [Bioperl-guts-l] BioPerl on GitHub will not install In-Reply-To: References: Message-ID: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> Yes, I can replicate that. It's from the recent renaming for scripts. I'll look into it. chris On Aug 4, 2011, at 12:39 PM, Razi Khaja wrote: > All, > > I just checked out the latest development version of BioPerl from GitHub and > found that it does not install because bp_das_server.pl is missing. > > Building BioPerl > 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are > identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 > Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm > line 218. > Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm > line 218. > Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': > No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. 
> > After copying the bp_das_server.pl that I had from a previous installation > to 'blib/script', I was able to ./Build test and ./Build install the > development version I checked out. > > Could someone test out this problem and fix it on github? if it really is a > problem? > > Thanks, > > Razi > _______________________________________________ > Bioperl-guts-l mailing list > Bioperl-guts-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l From hlapp at drycafe.net Thu Aug 4 17:31:52 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Thu, 4 Aug 2011 17:31:52 -0400 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: I agree. In fact I'm surprised that $io->_pushback() does not act like unshift() - that's I thought how it is used. -hilmar On Aug 3, 2011, at 10:16 AM, Chris Fields wrote: > On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: > >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Hi folks, >> >> as I mentioned on https://redmine.open-bio.org/issues/3264 there is >> something odd going on with Bio::Root::IO's _readline/_pushback >> functions. This seems to be intentional, at least there is a test >> case >> asserting the behaviour I'm seeing. It his however very confusing >> to the >> unexpecting programmer using the code. >> >> One assumption I'd immediately make would be that if I have code that >> does a $foo = $io->_readline; $io->_pushback($foo); $bar = >> $io->_readline;, $foo will be the same string as $bar, regardless >> what >> other pieces of the code did. Currently, this is not the case, >> because >> the readbuffer that _pushback pushes back into has new strings >> appended >> to the end but readline removes them from the front. > > I think this test is performed in the regressions already, but if > not then it is more than welcome. 
> >> This easily violates the "principle of least surprise", so I think we >> should change the readbuffer to a stack. As far as I can tell, >> changing >> the _pushback function to "unshift" instead of "push" to the >> readbuffer >> breaks only the Root/RootIO.t test designed to test the old >> behaviour. I >> don't see any other tests failing on my system that don't fail >> without >> this patch. >> >> Any comments from the core devs? > > I don't have a problem with that beyond the change to the RootIO.t > tests (it implies a specific behavior that some developers expect, > so is a very subtle API change). However, this is how one would > expect it, to be more like an 'unread' stack instead of a queue. In > fact, there is a module I used for Biome's pushback/readline called > IO::Unread that implements an IO layer for mimicing this behavior, > might be worth looking into. > >> Cheers, >> Kai > > chris > > > Christopher Fields > Senior Research Scientist > National Center for Supercomputing Applications > Institute for Genomic Biology > University of Illinois Urbana-Champaign > 1206 W. Gregory Dr. , MC-195 > Urbana, IL 61801 > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Thu Aug 4 17:42:30 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 16:42:30 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> Yeah, it's a queue; the 'buffering' is a simple internal array using push/shift. I say we merge the change in from the branch and fix any modules accordingly. 
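
The queue-versus-stack difference discussed in this thread can be reproduced with a small toy reader. This is only a sketch under stated assumptions: ToyReader, next_line(), pushback() and round_trip_ok() are hypothetical names, not the actual Bio::Root::IO code; the only thing taken from the thread is that the buffer is a plain array, filled with push (current behaviour) or unshift (proposed behaviour) and emptied with shift.

```perl
use strict;
use warnings;

package ToyReader;    # hypothetical class, not Bio::Root::IO

sub new {
    my ($class, %args) = @_;
    return bless {
        lines  => [ @{ $args{lines} || [] } ],    # pretend input stream
        buffer => [],                             # pushed-back lines
        mode   => $args{mode},                    # 'queue' or 'stack'
    }, $class;
}

# Serve pushed-back lines first (from the front), then the stream.
sub next_line {
    my $self = shift;
    return shift @{ $self->{buffer} } if @{ $self->{buffer} };
    return shift @{ $self->{lines} };
}

# 'queue' appends to the end of the buffer (push);
# 'stack' puts the line back at the front (unshift).
sub pushback {
    my ($self, $line) = @_;
    if ($self->{mode} eq 'stack') {
        unshift @{ $self->{buffer} }, $line;
    } else {
        push @{ $self->{buffer} }, $line;
    }
    return;
}

package main;

# Read a line, push it back, read again: do we get the same line,
# even when other code has already pushed lines back?
sub round_trip_ok {
    my ($mode) = @_;
    my $r = ToyReader->new(mode => $mode, lines => []);
    $r->pushback("line 1\n");    # earlier pushbacks by "other code"
    $r->pushback("line 2\n");
    my $foo = $r->next_line;
    $r->pushback($foo);
    my $bar = $r->next_line;
    return $foo eq $bar ? 1 : 0;
}

print "queue round trip ok: ", round_trip_ok('queue'), "\n";
print "stack round trip ok: ", round_trip_ok('stack'), "\n";
```

With two lines already buffered, the queue variant hands back a different line after the read/pushback pair, while the stack variant returns the line that was just pushed back, which is the behaviour the read-then-pushback idiom expects.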
chris On Aug 4, 2011, at 4:31 PM, Hilmar Lapp wrote: > I agree. In fact I'm surprised that $io->_pushback() does not act like unshift() - that's I thought how it is used. > > -hilmar > > On Aug 3, 2011, at 10:16 AM, Chris Fields wrote: > >> On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: >> >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> Hi folks, >>> >>> as I mentioned on https://redmine.open-bio.org/issues/3264 there is >>> something odd going on with Bio::Root::IO's _readline/_pushback >>> functions. This seems to be intentional, at least there is a test case >>> asserting the behaviour I'm seeing. It his however very confusing to the >>> unexpecting programmer using the code. >>> >>> One assumption I'd immediately make would be that if I have code that >>> does a $foo = $io->_readline; $io->_pushback($foo); $bar = >>> $io->_readline;, $foo will be the same string as $bar, regardless what >>> other pieces of the code did. Currently, this is not the case, because >>> the readbuffer that _pushback pushes back into has new strings appended >>> to the end but readline removes them from the front. >> >> I think this test is performed in the regressions already, but if not then it is more than welcome. >> >>> This easily violates the "principle of least surprise", so I think we >>> should change the readbuffer to a stack. As far as I can tell, changing >>> the _pushback function to "unshift" instead of "push" to the readbuffer >>> breaks only the Root/RootIO.t test designed to test the old behaviour. I >>> don't see any other tests failing on my system that don't fail without >>> this patch. >>> >>> Any comments from the core devs? >> >> I don't have a problem with that beyond the change to the RootIO.t tests (it implies a specific behavior that some developers expect, so is a very subtle API change). However, this is how one would expect it, to be more like an 'unread' stack instead of a queue. 
In fact, there is a module I used for Biome's pushback/readline called IO::Unread that implements an IO layer for mimicing this behavior, might be worth looking into. >> >>> Cheers, >>> Kai >> >> chris >> >> >> Christopher Fields >> Senior Research Scientist >> National Center for Supercomputing Applications >> Institute for Genomic Biology >> University of Illinois Urbana-Champaign >> 1206 W. Gregory Dr. , MC-195 >> Urbana, IL 61801 >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 4 18:11:29 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 17:11:29 -0500 Subject: [Bioperl-l] [Bioperl-guts-l] BioPerl on GitHub will not install In-Reply-To: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> References: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> Message-ID: <0A691C42-539E-45A1-B44F-7B0B5D8DE3D8@illinois.edu> Now fixed on github. There was some cruft left in Bio::Root::Build that didn't deal with the recent script renaming. chris On Aug 4, 2011, at 12:42 PM, Chris Fields wrote: > Yes, I can replicate that. It's from the recent renaming for scripts. I'll look into it. > > chris > > On Aug 4, 2011, at 12:39 PM, Razi Khaja wrote: > >> All, >> >> I just checked out the latest development version of BioPerl from GitHub and >> found that it does not install because bp_das_server.pl is missing. 
>> >> Building BioPerl >> 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are >> identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 >> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm >> line 218. >> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm >> line 218. >> Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': >> No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. >> >> After copying the bp_das_server.pl that I had from a previous installation >> to 'blib/script', I was able to ./Build test and ./Build install the >> development version I checked out. >> >> Could someone test out this problem and fix it on github? if it really is a >> problem? >> >> Thanks, >> >> Razi >> _______________________________________________ >> Bioperl-guts-l mailing list >> Bioperl-guts-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From shachigahoimbi at gmail.com Fri Aug 5 01:40:11 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Fri, 5 Aug 2011 11:10:11 +0530 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: <4E3A80EF.2010409@gmail.com> References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> Message-ID: Instead of both node id and accession, Can I replace node id with accession? On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote: > Hi Shachi, > > Please keep replies on the mailing list, that way others can follow the > discussion. > > As I mentioned, it is not possible to draw njplot-style trees with labelled > branches using Bio::Tree::Draw::Cladogram, it currently only labels nodes > (you could perhaps add branch labels as a feature request on Redmine). 
> > The code I gave overwrites the existing "leaf" node ids (the accessions) > with branch lengths, if you want to also keep the existing labels you could > try something like: > > > for my $node ($t1->get_nodes) { > if ($node->is_Leaf) { > $node->id($node->branch_**length.' '.$node->id); > } else { > > $node->id($node->branch_**length) > } > } > > Cheers, > Roy. > > > On 04/08/2011 05:36, Shachi Gahoi wrote: > >> Thank You so much. Now branch length is coming in tree. >> >> But I want Accesssion number in place of node id. >> >> I attached snapshot of tree as I want. Please tell me how can I do this. >> >> >> >> >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri > >> wrote: >> >> Sorry, the code had a typo, it should be: >> >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($t1->get_nodes) { >> >> $node->id($node->branch___**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> On 03/08/2011 14:58, Roy Chaudhuri wrote: >> >> Hi Shachi, >> >> I don't think you can draw labels on branches using >> Bio::Tree::Draw::Cladogram. However, it will draw node labels, >> so you >> could copy the branch lengths over to the node ids: >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($tree->get_nodes) { >> $node->id($node->branch___**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> Incidentally, in your script you write the tree out to a file, >> then read >> it back in using TreeIO. This is unnecessary, you can use $tree >> directly >> as input to Bio::Tree::Draw::Cladogram. >> >> Alternatively, you could write out a newick file and use >> non-Bioperl >> software such as njplot or MEGA to draw your tree with labelled >> branch >> lengths. >> >> Cheers, >> Roy. 
>> >> On 03/08/2011 07:00, Shachi Gahoi wrote: >> >> Dear All >> >> I am using Bio::Tree modules for constructing and drawing >> tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also attached >> generated tree. >> >> Thanks in advance >> >> ##############################**__############################ >> **##__##########################**####__###### >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', >> -format=>'clustalw'); >> >> my $dfactory = Bio::Tree::DistanceFactory->__**new(-method => >> 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics-**__>new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', -file >> =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', -align >> => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = >> Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree >> => $t1, >> >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> ##############################**__############################ >> **##__##########################**####__############## >> >> >> >> >> ______________________________**___________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> >> > >> >> http://lists.open-bio.org/__**mailman/listinfo/bioperl-l >> >> > >> >> >> >> >> 
>>
>> --
>> Regards,
>> Shachi
>>
>

--
Regards,
Shachi

From kai.blin at biotech.uni-tuebingen.de Fri Aug 5 04:40:57 2011
From: kai.blin at biotech.uni-tuebingen.de (Kai Blin)
Date: Fri, 05 Aug 2011 10:40:57 +0200
Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior
In-Reply-To: <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu>
References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu>
Message-ID: <4E3BAC99.8050806@biotech.uni-tuebingen.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2011-08-04 23:42, Chris Fields wrote:

> Yeah, it's a queue; the 'buffering' is a simple internal array using
> push/shift. I say we merge the change in from the branch and fix
> any modules accordingly.

Ok, I'm happy to take care of it, if people can tell me how to find and
fix modules that use the old assumption. My initial attempt right after
making the change was to run the test suite, which came up clean apart
from the RootIO.t case that my patch now modifies as well.

Cheers,
Kai

- --
Dipl.-Inform.
Kai Blin                       kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-Universität Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax   : ++49 7071 29-5979
Germany
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOO6yZAAoJEKM5lwBiwTTPdjsH/0ELbz9VYIzxlpx+QZ3Jvd55
KTXVP+oOzjIDlOdxbdqYR0w04VXnpkQek3hVt0mbreuKvtdMJY/YhRwZLiOzYSak
ruhswUJQnm3K2vkaqpgLESIIUASneFrW7ezfV3R9q/Ov730GBDAtkLTEk7cVV5Cg
W515ixJtNC7v6fZmNFJZudQbcUYYgy+8BFgvNUaSoH8YqubMXzjFXknBWeWT0qco
ivHjqIc6Nkap799ijPiLEU7ArI1pEOB2jyvjntIocFR72imbo7e86RaVHJCNl/N7
GFbRGoH2m7LVeWFYuNM3vsTS3W4KVLg9U/8UBysykR3uoHAVJhm4T5nCT4NKE/w=
=z6QZ
-----END PGP SIGNATURE-----

From roy.chaudhuri at gmail.com Fri Aug 5 06:54:32 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 05 Aug 2011 11:54:32 +0100
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To:
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com>
Message-ID: <4E3BCBE8.4030303@gmail.com>

In that case then you only want to add branch lengths to non-leaf nodes,
so it would be:

for my $node ($t1->get_nodes) {
    $node->id($node->branch_length) unless $node->is_Leaf;
}

On 05/08/2011 06:40, Shachi Gahoi wrote:
>
> Instead of both node id and accession, Can I replace node id with accession?
>
>
> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote:
>
> Hi Shachi,
>
> Please keep replies on the mailing list, that way others can follow
> the discussion.
>
> As I mentioned, it is not possible to draw njplot-style trees with
> labelled branches using Bio::Tree::Draw::Cladogram, it currently
> only labels nodes (you could perhaps add branch labels as a feature
> request on Redmine).
> > The code I gave overwrites the existing "leaf" node ids (the > accessions) with branch lengths, if you want to also keep the > existing labels you could try something like: > > > for my $node ($t1->get_nodes) { > if ($node->is_Leaf) { > $node->id($node->branch___length.' '.$node->id); > } else { > > $node->id($node->branch___length) > } > } > > Cheers, > Roy. > > > On 04/08/2011 05:36, Shachi Gahoi wrote: > > Thank You so much. Now branch length is coming in tree. > > But I want Accesssion number in place of node id. > > I attached snapshot of tree as I want. Please tell me how can I > do this. > > > > > On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri > > >> wrote: > > Sorry, the code had a typo, it should be: > > > my $obj1 = Bio::Tree::Draw::Cladogram->____new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > for my $node ($t1->get_nodes) { > > $node->id($node->branch_____length) if defined > $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > On 03/08/2011 14:58, Roy Chaudhuri wrote: > > Hi Shachi, > > I don't think you can draw labels on branches using > Bio::Tree::Draw::Cladogram. However, it will draw node > labels, > so you > could copy the branch lengths over to the node ids: > > my $obj1 = > Bio::Tree::Draw::Cladogram->____new(-bootstrap => 1, > -tree => > $t1, > -compact => > 0); > for my $node ($tree->get_nodes) { > $node->id($node->branch_____length) if defined > $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > Incidentally, in your script you write the tree out to a > file, > then read > it back in using TreeIO. This is unnecessary, you can > use $tree > directly > as input to Bio::Tree::Draw::Cladogram. > > Alternatively, you could write out a newick file and use > non-Bioperl > software such as njplot or MEGA to draw your tree with > labelled > branch > lengths. > > Cheers, > Roy. 
> > On 03/08/2011 07:00, Shachi Gahoi wrote: > > Dear All > > I am using Bio::Tree modules for constructing and > drawing > tree. *I am unable > to show branch length value in tree. > * > Please tell me How can I do this, if anybody knows. > > Here is my script which i am using...and i also attached > generated tree. > > Thanks in advance > > > ##############################____############################__##__##########################__####__###### > > use Bio::AlignIO; > use Bio::Align::ProteinStatistics; > use Bio::Tree::DistanceFactory; > use Bio::TreeIO; > use Bio::Tree::Draw::Cladogram; > > # for a dna alignment > # can also use ProteinStatistics > > my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', > -format=>'clustalw'); > > my $dfactory = > Bio::Tree::DistanceFactory->____new(-method => > 'UPGMA'); > > my $stats = Bio::Align::ProteinStatistics-____>new; > > my $treeout = Bio::TreeIO->new(-format => 'newick', > -file > =>'>ADP1.dnd'); > > while( my $aln = $alnio->next_aln ) > { > my $mat = $stats->distance(-method => 'Kimura', > -align > => $aln); > > my $tree = $dfactory->make_tree($mat); > $treeout->write_tree($tree); > } > > my $dir = shift || '.'; > > opendir(DIR, $dir) || die $!; > for my $file ( readdir(DIR) ) > { > next unless $file =~ /(\S+)\.dnd$/; > my $stem = $1; > my $treeio = Bio::TreeIO->new('-format' => > 'newick', > '-file' => "$dir/$file"); > > if( my $t1 = $treeio->next_tree ) > { > my $obj1 = > Bio::Tree::Draw::Cladogram->____new(-bootstrap => 1, > > -tree > => $t1, > > -compact => 0); > $obj1->print(-file => "$dir/$stem.eps"); > } > } > > > ##############################____############################__##__##########################__####__############## > > > > > ___________________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > > > > http://lists.open-bio.org/____mailman/listinfo/bioperl-l > > > > > > > > > > -- > Regards, > Shachi > > > > > > -- > Regards, > Shachi From lskatz at 
gmail.com Fri Aug 5 10:32:50 2011 From: lskatz at gmail.com (Lee Katz) Date: Fri, 5 Aug 2011 10:32:50 -0400 Subject: [Bioperl-l] SeqIO: paired end reads In-Reply-To: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> Message-ID: Thank you. I figured out through the Newbler manual that there is a linker sequence to separate the paired end reads. Then, the forum at http://seqanswers.com/forums/showthread.php?t=12940 showed me that the linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC". I think a useful addition to bioperl could be to have paired end reads. This is outside of the domain of bioperl, but now I am left wondering how I could specify the distance between reads in Newbler, if the linker sequence is fixed. On Wed, Aug 3, 2011 at 5:17 PM, Jason Stajich wrote: > it depends on the assembler - For Illumina usually the paired ends end with > /1 /2 and they have the same ID but are in two different files. Depends on > if you are using interleaved paired reads or in two separate files. some > just expect the paired reads to be mated by virtue of being in same order in > two files. the ABYSS and Velvet manuals both explain what is expected so > you will want to check on what are Newbler's assumptions on how the paired > ends are encoded. > > There are simulator tools if that is what you are trying to do in the end? > checkout wgsim which comes with samtools or try dnaa > > > On Aug 3, 2011, at 1:01 PM, Lee Katz wrote: > > > Hi all! I was wondering how to construct paired end reads from scratch. > I > > know the locations of certain sequences across the genome with a high > degree > > of confidence and so I want to give them to my assembler as paired end > > reads, along with my other sequence runs (454 and Illumina runs). I plan > to > > use Newbler. 
> > > > My only problem is that I do not know the correct format in order to > specify > > distance and sequences for a paired end reads run, and so I hope that > there > > is a SeqIO solution. At the least, I hope that one bioperl member can > point > > me to where the definition of the paired end reads file format is...? > > > > Thank you! > > > > --Lee > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Fri Aug 5 11:50:42 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 5 Aug 2011 10:50:42 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4E3BAC99.8050806@biotech.uni-tuebingen.de> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> <4E3BAC99.8050806@biotech.uni-tuebingen.de> Message-ID: <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> I would just go based on the test suite for now. If we run into others that don't have tests we need to add new tests for those anyway. chris On Aug 5, 2011, at 3:40 AM, Kai Blin wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 2011-08-04 23:42, Chris Fields wrote: > >> Yeah, it's a queue; the 'buffering' is a simple internal array using >> push/shift. I say we merge the change in from the branch and fix >> any modules accordingly. > > Ok, I'm happy to take care of it, if people can tell me how to find and > fix modules that use the old assumption. My initial attempt right after > making the change was to run the test suite, which came up clean apart > from the RootIO.t case that my patch now modifies as well. > > Cheers, > Kai > > - -- > Dipl.-Inform. 
Kai Blin kai.blin at biotech.uni-tuebingen.de > Institute for Microbiology and Infection Medicine > Division of Microbiology/Biotechnology > Eberhard-Karls-Universit?t T?bingen > Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 > D-72076 T?bingen Fax : ++49 7071 29-5979 > Germany > Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iQEcBAEBAgAGBQJOO6yZAAoJEKM5lwBiwTTPdjsH/0ELbz9VYIzxlpx+QZ3Jvd55 > KTXVP+oOzjIDlOdxbdqYR0w04VXnpkQek3hVt0mbreuKvtdMJY/YhRwZLiOzYSak > ruhswUJQnm3K2vkaqpgLESIIUASneFrW7ezfV3R9q/Ov730GBDAtkLTEk7cVV5Cg > W515ixJtNC7v6fZmNFJZudQbcUYYgy+8BFgvNUaSoH8YqubMXzjFXknBWeWT0qco > ivHjqIc6Nkap799ijPiLEU7ArI1pEOB2jyvjntIocFR72imbo7e86RaVHJCNl/N7 > GFbRGoH2m7LVeWFYuNM3vsTS3W4KVLg9U/8UBysykR3uoHAVJhm4T5nCT4NKE/w= > =z6QZ > -----END PGP SIGNATURE----- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 5 16:49:54 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 5 Aug 2011 15:49:54 -0500 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Message-ID: <1FDBD8D4-E8E6-44EB-A18A-7E74A0EF9014@illinois.edu> Okay, I tested this out on a branch and then merged into 'master'. Test::Most is a 'build_requires'; Bio::Root::Test is now just a wrapper for Test::Most methods, with a few extra wrinkles to deal with Test::Warn and a few additional methods. I also removed extraneous modules in t/lib along with Bio::Root::Test::Warn (that code was merged into Bio::Root::Test to keep all evilness in one contained location). The nice thing is the transition didn't require changing any tests. 
However, this will require some testing across the board to make sure everything's working. Maybe worth getting the code cleaned up for another quick point release prior to the GSoC mayhem to ensue shortly... :) chris On Aug 1, 2011, at 3:34 PM, Chris Fields wrote: > Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! > > chris > > On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > >> I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. >> >> -hilmar >> >> On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: >> >>> All, >>> >>> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >>> >>> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). >>> >>> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >>> >>> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. 
Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >>> >>> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >>> >>> >>> chris >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : >> =========================================================== >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From kai.blin at biotech.uni-tuebingen.de Fri Aug 5 18:35:32 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Sat, 06 Aug 2011 00:35:32 +0200 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> <4E3BAC99.8050806@biotech.uni-tuebingen.de> <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> Message-ID: <4E3C7034.2000106@biotech.uni-tuebingen.de> On 2011-08-05 17:50, Chris Fields wrote: > I would just go based on the test suite for now. If we run into > others that don't have tests we need to add new tests for those > anyway. Ok, pushed to master. Cheers, Kai -- Dipl.-Inform. 
Kai Blin                       kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-University of Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax   : ++49 7071 29-5979
Deutschland
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben

From shachigahoimbi at gmail.com Sat Aug 6 00:25:43 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Sat, 6 Aug 2011 09:55:43 +0530
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To: <4E3BCBE8.4030303@gmail.com>
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> <4E3BCBE8.4030303@gmail.com>
Message-ID:

Thank you so much.

Please tell me one more thing, can I reduce branch length font?

On Fri, Aug 5, 2011 at 4:24 PM, Roy Chaudhuri wrote:

> In that case then you only want to add branch lengths to non-leaf nodes, so
> it would be:
>
> for my $node ($t1->get_nodes) {
>     $node->id($node->branch_length) unless $node->is_Leaf
> }
>
>
> On 05/08/2011 06:40, Shachi Gahoi wrote:
>
>>
>> Instead of both node id and accession, Can I replace node id with
>> accession?
>>
>>
>> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote:
>>
>> Hi Shachi,
>>
>> Please keep replies on the mailing list, that way others can follow
>> the discussion.
>>
>> As I mentioned, it is not possible to draw njplot-style trees with
>> labelled branches using Bio::Tree::Draw::Cladogram, it currently
>> only labels nodes (you could perhaps add branch labels as a feature
>> request on Redmine).
>>
>> The code I gave overwrites the existing "leaf" node ids (the
>> accessions) with branch lengths, if you want to also keep the
>> existing labels you could try something like:
>>
>> for my $node ($t1->get_nodes) {
>>     if ($node->is_Leaf) {
>>         $node->id($node->branch_length.' '.$node->id);
>>     } else {
>>         $node->id($node->branch_length)
>>     }
>> }
>>
>> Cheers,
>> Roy.
>> >> >> On 04/08/2011 05:36, Shachi Gahoi wrote: >> >> Thank You so much. Now branch length is coming in tree. >> >> But I want Accesssion number in place of node id. >> >> I attached snapshot of tree as I want. Please tell me how can I >> do this. >> >> >> >> >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri >> >> > >> > >>> >> wrote: >> >> Sorry, the code had a typo, it should be: >> >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => >> 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($t1->get_nodes) { >> >> $node->id($node->branch_____**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> On 03/08/2011 14:58, Roy Chaudhuri wrote: >> >> Hi Shachi, >> >> I don't think you can draw labels on branches using >> Bio::Tree::Draw::Cladogram. However, it will draw node >> labels, >> so you >> could copy the branch lengths over to the node ids: >> >> my $obj1 = >> Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => 1, >> -tree => >> $t1, >> -compact => >> 0); >> for my $node ($tree->get_nodes) { >> $node->id($node->branch_____**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> Incidentally, in your script you write the tree out to a >> file, >> then read >> it back in using TreeIO. This is unnecessary, you can >> use $tree >> directly >> as input to Bio::Tree::Draw::Cladogram. >> >> Alternatively, you could write out a newick file and use >> non-Bioperl >> software such as njplot or MEGA to draw your tree with >> labelled >> branch >> lengths. >> >> Cheers, >> Roy. >> >> On 03/08/2011 07:00, Shachi Gahoi wrote: >> >> Dear All >> >> I am using Bio::Tree modules for constructing and >> drawing >> tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also >> attached >> generated tree. 
>> >> Thanks in advance >> >> >> ##############################**____##########################** >> ##__##__######################**####__####__###### >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', >> -format=>'clustalw'); >> >> my $dfactory = >> Bio::Tree::DistanceFactory->__**__new(-method => >> 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics-**____>new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', >> -file >> =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', >> -align >> => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => >> 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = >> Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => >> 1, >> >> -tree >> => $t1, >> >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> >> ##############################**____##########################** >> ##__##__######################**####__####__############## >> >> >> >> >> ______________________________**_____________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org > bio.org > >> >> >> >> >> >> http://lists.open-bio.org/____**mailman/listinfo/bioperl-l >> >> > >> >> >> >> >> >> >> >> >> >> >> -- >> Regards, >> Shachi >> >> >> >> >> >> -- >> Regards, >> Shachi >> > > -- Regards, Shachi From p.j.a.cock at googlemail.com Sun Aug 7 05:40:52 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 7 Aug 2011 10:40:52 +0100 Subject: 
[Bioperl-l] SeqIO: paired end reads
In-Reply-To:
References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com>
Message-ID:

On Friday, August 5, 2011, Lee Katz wrote:
> Thank you. I figured out through the Newbler manual that there is a linker
> sequence to separate the paired end reads. Then, the forum at
> http://seqanswers.com/forums/showthread.php?t=12940 showed me that the
> linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC".

There is more than one Roche 454 linker sequence depending on the chemistry
used; one is the same as its reverse complement, one isn't.

There is nothing in the SFF file format (nor the Roche specific XML manifest
last time I checked) that handles the paired end information explicitly.

> I think a useful addition to bioperl could be to have paired end reads.
>

Maybe, but to do this well you'd want to do flow space alignment of the
reads to the linker sequence to find the imperfectly called linker
sequences.

Personally I use sff_extract, which is a free open source command line tool
for this (calling an external alignment tool for paired end 454).

> This is outside of the domain of bioperl, but now I am left wondering how I
> could specify the distance between reads in Newbler, if the linker sequence
> is fixed.

How to do that depends on the alignment or assembly tool you are using.

Peter

From cjfields at illinois.edu Sun Aug 7 11:51:19 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 7 Aug 2011 10:51:19 -0500
Subject: [Bioperl-l] SeqIO: paired end reads
In-Reply-To:
References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com>
Message-ID: <19923C8B-6C84-4D9B-8D37-86CAE9BC681E@illinois.edu>

On Aug 7, 2011, at 4:40 AM, Peter Cock wrote:

> On Friday, August 5, 2011, Lee Katz wrote:
>> Thank you. I figured out through the Newbler manual that there is a linker
>> sequence to separate the paired end reads.
>> Then, the forum at
>> http://seqanswers.com/forums/showthread.php?t=12940 showed me that the
>> linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC".
>
> There is more than one Roche 454 linker sequence depending on the chemistry
> used; one is the same as its reverse complement, one isn't.
>
> There is nothing in the SFF file format (nor the Roche specific XML manifest
> last time I checked) that handles the paired end information explicitly.

Yep, it's all implied AFAIK.

>> I think a useful addition to bioperl could be to have paired end reads.
>>
>
> Maybe, but to do this well you'd want to do flow space alignment of the
> reads to the linker sequence to find the imperfectly called linker
> sequences.
>
> Personally I use sff_extract, which is a free open source command line tool
> for this (calling an external alignment tool for paired end 454).

I think it could be done, but I would implement something like this as a
wrapper around faster tools (like sff_extract or similar). Implementing the
functionality in pure (bio)perl/(bio)python doesn't make much sense if there
are newer/faster tools out there.

>> This is outside of the domain of bioperl, but now I am left wondering how I
>> could specify the distance between reads in Newbler, if the linker sequence
>> is fixed.
>
> How to do that depends on the alignment or assembly tool you are using.
>
> Peter

Yep. I don't think there is a defined way to specify that in any format that I know of.
chris

From Russell.Smithies at agresearch.co.nz Sun Aug 7 17:45:19 2011
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 8 Aug 2011 09:45:19 +1200
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To:
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> <4E3BCBE8.4030303@gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D3C9@exchsth.agresearch.co.nz>

The constructor for Bio::Tree::Draw::Cladogram lets you specify the font and size; did you try setting them there?

 Title   : new
 Usage   : my $obj = Bio::Tree::Draw::Cladogram->new();
 Function: Builds a new Bio::Tree::Draw::Cladogram object
 Returns : Bio::Tree::Draw::Cladogram
 Args    : -tree => Bio::Tree::Tree object
           -second => Bio::Tree::Tree object (optional)
           -font => font name [string] (optional)       <<<<-------------
           -size => font size [integer] (optional)      <<<<-------------
           -top => top margin [integer] (optional)
           -bottom => bottom margin [integer] (optional)
           -left => left margin [integer] (optional)
           -right => right margin [integer] (optional)
           -tip => extra tip space [integer] (optional)
           -column => extra space between cladograms [integer] (optional)
           -compact => ignore branch lengths [boolean] (optional)
           -ratio => horizontal to vertical ratio [integer] (optional)
           -colors => use colors to color edges [boolean] (optional)
           -bootstrap => draw bootstrap or internal ids [boolean]

--Russell

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Shachi Gahoi
> Sent: Saturday, 6 August 2011 4:26 p.m.
> To: Roy Chaudhuri
> Cc: bioperl-l List
> Subject: Re: [Bioperl-l] How to show branch length value in tree
>
> Thank you so much.
>
> Please tell me one more thing, *can I reduce branch length font?
> * > On Fri, Aug 5, 2011 at 4:24 PM, Roy Chaudhuri > wrote: > > > In that case then you only want to add branch lengths to non-leaf > nodes, so > > it would be: > > > > > > for my $node ($t1->get_nodes) { > > $node->id($node->branch_**length) unless $node->is_Leaf > > > > } > > > > > > On 05/08/2011 06:40, Shachi Gahoi wrote: > > > >> > >> Instead of both node id and accession, Can I replace node id with > >> accession? > >> > >> > >> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri > >> >> wrote: > >> > >> Hi Shachi, > >> > >> Please keep replies on the mailing list, that way others can > follow > >> the discussion. > >> > >> As I mentioned, it is not possible to draw njplot-style trees > with > >> labelled branches using Bio::Tree::Draw::Cladogram, it currently > >> only labels nodes (you could perhaps add branch labels as a > feature > >> request on Redmine). > >> > >> The code I gave overwrites the existing "leaf" node ids (the > >> accessions) with branch lengths, if you want to also keep the > >> existing labels you could try something like: > >> > >> > >> for my $node ($t1->get_nodes) { > >> if ($node->is_Leaf) { > >> $node->id($node->branch___**length.' '.$node->id); > >> } else { > >> > >> $node->id($node->branch___**length) > >> } > >> } > >> > >> Cheers, > >> Roy. > >> > >> > >> On 04/08/2011 05:36, Shachi Gahoi wrote: > >> > >> Thank You so much. Now branch length is coming in tree. > >> > >> But I want Accesssion number in place of node id. > >> > >> I attached snapshot of tree as I want. Please tell me how can > I > >> do this. 
> >> > >> > >> > >> > >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri > >> > >> > > >> >> >>> > >> wrote: > >> > >> Sorry, the code had a typo, it should be: > >> > >> > >> my $obj1 = Bio::Tree::Draw::Cladogram->__**__new(- > bootstrap => > >> 1, > >> -tree => > $t1, > >> -compact => > 0); > >> for my $node ($t1->get_nodes) { > >> > >> $node->id($node->branch_____**length) if defined > >> $node->branch_length; > >> } > >> $obj1->print(-file => "$dir/$stem.eps") > >> > >> On 03/08/2011 14:58, Roy Chaudhuri wrote: > >> > >> Hi Shachi, > >> > >> I don't think you can draw labels on branches using > >> Bio::Tree::Draw::Cladogram. However, it will draw > node > >> labels, > >> so you > >> could copy the branch lengths over to the node ids: > >> > >> my $obj1 = > >> Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => 1, > >> -tree > => > >> $t1, > >> -compact > => > >> 0); > >> for my $node ($tree->get_nodes) { > >> $node->id($node->branch_____**length) if > defined > >> $node->branch_length; > >> } > >> $obj1->print(-file => "$dir/$stem.eps") > >> > >> Incidentally, in your script you write the tree out > to a > >> file, > >> then read > >> it back in using TreeIO. This is unnecessary, you can > >> use $tree > >> directly > >> as input to Bio::Tree::Draw::Cladogram. > >> > >> Alternatively, you could write out a newick file and > use > >> non-Bioperl > >> software such as njplot or MEGA to draw your tree > with > >> labelled > >> branch > >> lengths. > >> > >> Cheers, > >> Roy. > >> > >> On 03/08/2011 07:00, Shachi Gahoi wrote: > >> > >> Dear All > >> > >> I am using Bio::Tree modules for constructing and > >> drawing > >> tree. *I am unable > >> to show branch length value in tree. > >> * > >> Please tell me How can I do this, if anybody > knows. > >> > >> Here is my script which i am using...and i also > >> attached > >> generated tree. 
> >> > >> Thanks in advance > >> > >> > >> > ##############################**____##########################** > >> ##__##__######################**####__####__###### > >> > >> use Bio::AlignIO; > >> use Bio::Align::ProteinStatistics; > >> use Bio::Tree::DistanceFactory; > >> use Bio::TreeIO; > >> use Bio::Tree::Draw::Cladogram; > >> > >> # for a dna alignment > >> # can also use ProteinStatistics > >> > >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', > >> -format=>'clustalw'); > >> > >> my $dfactory = > >> Bio::Tree::DistanceFactory->__**__new(-method => > >> 'UPGMA'); > >> > >> my $stats = Bio::Align::ProteinStatistics- > **____>new; > >> > >> my $treeout = Bio::TreeIO->new(-format => > 'newick', > >> -file > >> =>'>ADP1.dnd'); > >> > >> while( my $aln = $alnio->next_aln ) > >> { > >> my $mat = $stats->distance(-method => > 'Kimura', > >> -align > >> => $aln); > >> > >> my $tree = $dfactory->make_tree($mat); > >> $treeout->write_tree($tree); > >> } > >> > >> my $dir = shift || '.'; > >> > >> opendir(DIR, $dir) || die $!; > >> for my $file ( readdir(DIR) ) > >> { > >> next unless $file =~ /(\S+)\.dnd$/; > >> my $stem = $1; > >> my $treeio = Bio::TreeIO->new('-format' => > >> 'newick', > >> '-file' => "$dir/$file"); > >> > >> if( my $t1 = $treeio->next_tree ) > >> { > >> my $obj1 = > >> Bio::Tree::Draw::Cladogram->__**__new(-bootstrap > => > >> 1, > >> > >> -tree > >> => $t1, > >> > >> -compact => 0); > >> $obj1->print(-file => > "$dir/$stem.eps"); > >> } > >> } > >> > >> > >> > ##############################**____##########################** > >> ##__##__######################**####__####__############## > >> > >> > >> > >> > >> > ______________________________**_____________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org >> bio.org > > >> l at lists.open-__bio.org> > >> bio.org> > >> >> > >> > >> http://lists.open-bio.org/____**mailman/listinfo/bioperl- > l > >> l > >> > > >> l > >> l > >> >> > >> > >> > >> > >> > >> > >> > >> -- > 
> >> Regards,
> >> Shachi
> >>
> >> --
> >> Regards,
> >> Shachi
> >
> --
> Regards,
> Shachi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================

From cjfields at illinois.edu Tue Aug 9 16:10:37 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 9 Aug 2011 15:10:37 -0500
Subject: [Bioperl-l] Question to Bio::SearchIO::infernal.pm
In-Reply-To:
References: <4E32E14B020000EE00004F57@gwia1.boku.ac.at>
Message-ID: <683C7B42-338F-42AE-AF93-11BFB4DB2CB7@illinois.edu>

Following this up: Nadine, did you have a bug to report? It's kind of hard to fix this without some example data.

chris

On Aug 3, 2011, at 8:10 AM, Chris Fields wrote:

> Nadine,
>
> Hard to guess w/o seeing the report, but I'm not terribly surprised. I believe I only coded for simple 1 CM reports, IIRC. You'll have to file this as a bug on redmine along with an example.
>
> chris
>
> On Jul 29, 2011, at 9:35 AM, Nadine Elpida Tatto wrote:
>
>> Hi There!
>>
>> I was wondering if you would or can help me.
>>
>> I have an infernal report containing about 2000 CMs from an infernal run against Rfam.cm. To parse this report I wanted to use Bio::SearchIO::infernal.pm. Unfortunately this turned out to be a problem for me, because "$parser->next_result" only delivers the result for the first CM in the report and nothing more.
>>
>> My code:
>>
>> #!/usr/bin/perl -w
>>
>> use strict;
>> use Data::Dumper;
>> use Bio::SearchIO;
>>
>> my $infile = $ARGV[0]; # infernal report
>> my $parser = Bio::SearchIO->new(-format => 'Infernal',
>>                                 -file   => $infile);
>>
>> while( my $result = $parser->next_result ) {
>>     print $result->query_name . "\n";
>> }
>>
>> exit;
>>
>> The output:
>>
>> ntatto:~$ ./infernalParser.pl infernal.output
>> 5S_rRNA
>> ntatto:~$
>>
>> I would expect the following (like parsing a blast report):
>>
>> ntatto:~$ ./infernalParser.pl infernal.output
>> 5S_rRNA
>> 5_8S_rRNA
>> U1
>> ...
>> ntatto:~$
>>
>> I would be glad for help.
>>
>> Thank you in advance.
>>
>> Best Regards
>>
>> N Tatto
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From torsten.seemann at infotech.monash.edu.au Sun Aug 14 04:32:46 2011
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sun, 14 Aug 2011 18:32:46 +1000
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID:

> I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.

I'm pretty sure that NCBI/GenBank do *not* accept GenBank files for
submission - which I found somewhat ironic!

They require an ASN.1-formatted file (an XML-like hierarchical format that
pre-dates XML), which is sometimes given a .sqn extension if you use
the Sequin GUI to prepare it. There are command line tools like
"tbl2asn" which will take the .tbl and .fsa files Brian has listed to
produce the ASN file too.

As far as I know, there are no NCBI tools to take a .gbk and produce
the .tbl/.fsa/.agp - does anyone know otherwise?

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Dept. Microbiology, Monash University, AUSTRALIA

From cjfields at illinois.edu Sun Aug 14 10:22:10 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 14 Aug 2011 09:22:10 -0500
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To:
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID: <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>

Not that I'm aware of, though it shouldn't be hard to set something up using Bio::SeqIO for that.

chris

On Aug 14, 2011, at 3:32 AM, Torsten Seemann wrote:

>> I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.
>
> I'm pretty sure that NCBI/GenBank do *not* accept GenBank files for
> submission - which I found somewhat ironic!
>
> They require an ASN.1-formatted file (an XML-like hierarchical format that
> pre-dates XML), which is sometimes given a .sqn extension if you use
> the Sequin GUI to prepare it. There are command line tools like
> "tbl2asn" which will take the .tbl and .fsa files Brian has listed to
> produce the ASN file too.
>
> As far as I know, there are no NCBI tools to take a .gbk and produce
> the .tbl/.fsa/.agp - does anyone know otherwise?
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
> University, AUSTRALIA
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From maximilien1er at gmail.com Sun Aug 14 10:23:39 2011
From: maximilien1er at gmail.com (Maxime Déraspe)
Date: Sun, 14 Aug 2011 10:23:39 -0400
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To:
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID: <1313331819.15034.4.camel@maximilian-home>

I know that Artemis from the Sanger Institute can convert a GenBank file
into a Sequin tab file. You could then use that file to submit to NCBI
with their Sequin software. But I think that the GenBank file would be OK too.

Max

On Sun, 2011-08-14 at 18:32 +1000, Torsten Seemann wrote:
> > I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.
>
> I'm pretty sure that NCBI/GenBank do *not* accept GenBank files for
> submission - which I found somewhat ironic!
>
> They require an ASN.1-formatted file (an XML-like hierarchical format that
> pre-dates XML), which is sometimes given a .sqn extension if you use
> the Sequin GUI to prepare it. There are command line tools like
> "tbl2asn" which will take the .tbl and .fsa files Brian has listed to
> produce the ASN file too.
>
> As far as I know, there are no NCBI tools to take a .gbk and produce
> the .tbl/.fsa/.agp - does anyone know otherwise?
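
The Bio::SeqIO route suggested in this thread (GenBank in, Sequin-style 5-column feature table out) could be sketched roughly as below. This is only an illustration under stated assumptions, not a tested submission tool: the names feature_to_tbl and genbank_to_tbl are made up for this example, partial-location markers (< and >) are ignored, features spanning the origin are not handled, and leaving out /translation is an assumption about what the downstream tool wants.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Format one feature as lines of a 5-column feature table (hypothetical helper).
#   $key   - feature key, e.g. 'CDS'
#   $spans - array ref of [start, end] pairs; start > end marks the minus strand
#   $quals - array ref of [qualifier, value] pairs
sub feature_to_tbl {
    my ($key, $spans, $quals) = @_;
    my @lines;
    for my $i (0 .. $#$spans) {
        my ($start, $end) = @{ $spans->[$i] };
        # Only the first interval of a feature carries the feature key.
        push @lines, $i == 0 ? "$start\t$end\t$key" : "$start\t$end";
    }
    # Qualifiers go on their own lines, preceded by three empty columns.
    push @lines, "\t\t\t$_->[0]\t$_->[1]" for @$quals;
    return @lines;
}

# Read a GenBank file with Bio::SeqIO and print a feature table to $out
# (hypothetical helper).
sub genbank_to_tbl {
    my ($infile, $out) = @_;
    require Bio::SeqIO;    # loaded here so feature_to_tbl has no dependencies
    my $in = Bio::SeqIO->new(-file => $infile, -format => 'genbank');
    while (my $seq = $in->next_seq) {
        print {$out} '>Feature ', $seq->display_id, "\n";
        for my $feat ($seq->get_SeqFeatures) {
            next if $feat->primary_tag eq 'source';
            my $minus = ($feat->strand || 0) < 0;
            # Simplification: order intervals by coordinate, descending on
            # the minus strand.
            my @locs = sort { $a->start <=> $b->start }
                       $feat->location->each_Location;
            @locs = reverse @locs if $minus;
            my @spans = map { $minus ? [$_->end, $_->start]
                                     : [$_->start, $_->end] } @locs;
            my @quals;
            for my $tag ($feat->get_all_tags) {
                next if $tag eq 'translation';    # assumption: recomputed downstream
                push @quals, map { [$tag, $_] } $feat->get_tag_values($tag);
            }
            print {$out} "$_\n"
                for feature_to_tbl($feat->primary_tag, \@spans, \@quals);
        }
    }
}

# Convert the file named on the command line, if one was given.
genbank_to_tbl($ARGV[0], \*STDOUT) if @ARGV;
```

Run as "perl gbk2tbl.pl input.gbk > output.tbl" (the script name is arbitrary). The formatting is kept in a separate function so it can be checked without BioPerl installed; the nucleotide .fsa file would still have to be written separately, for example with Bio::SeqIO in 'fasta' format.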

From punit_vergoboy2004 at yahoo.co.in Thu Aug 18 08:14:54 2011
From: punit_vergoboy2004 at yahoo.co.in (punit kumar)
Date: Thu, 18 Aug 2011 17:44:54 +0530 (IST)
Subject: [Bioperl-l] query about Bio::Tools::Run::RemoteBlast
In-Reply-To: <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>
Message-ID: <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>

Hi friends,

I am new to BioPerl, and I am using "Bio::Tools::Run::RemoteBlast" for remote BLAST. I tried this module and have had a little success so far. I want to get the description part of the BLAST alignments found against my query sequence, as shown in the format below, which is the output table of online BLAST:

Sequences producing significant alignments:
Accession  Description  Max score  Total score  Query coverage  E value  Links
NP_216760.1 acyl carrier protein [Mycobacterium tuberculosis H37Rv] >ref|NP_336774.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >ref|NP_855917.1| acyl carrier protein [Mycobacterium bovis AF2122/97] >ref|YP_978350.1| acyl carrier protein [Mycobacterium bovis BCG str. Pasteur 1173P2] >ref|YP_001283588.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_001288206.1| acyl carrier protein [Mycobacterium tuberculosis F11] >ref|ZP_02551632.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_002645307.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >ref|YP_003031689.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >ref|ZP_04925721.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >ref|ZP_04981085.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str.
Haarlem] >ref|ZP_05141736.1| acyl carrier protein [Mycobacterium tuberculosis '98-R604 INH-RIF-EM'] >ref|ZP_06433498.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >ref|ZP_06437620.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >ref|ZP_06443178.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >ref|ZP_06450592.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >ref|ZP_06455160.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >ref|ZP_06504896.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >ref|ZP_06510220.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >ref|ZP_06513730.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >ref|ZP_06517747.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >ref|ZP_06521786.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >ref|ZP_06799170.1| acyl carrier protein [Mycobacterium tuberculosis 210] >ref|ZP_06952619.1| acyl carrier protein [Mycobacterium tuberculosis KZN 4207] >ref|ZP_06960948.1| acyl carrier protein [Mycobacterium tuberculosis KZN R506] >ref|ZP_07013145.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >ref|ZP_07414839.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >ref|ZP_07418616.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >ref|ZP_07423348.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >ref|ZP_07427715.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >ref|ZP_07432018.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 
SUMu005] >ref|ZP_07436410.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >ref|ZP_07440655.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >ref|ZP_07445228.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >ref|ZP_07481045.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >ref|ZP_07485275.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >ref|ZP_07489492.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >ref|ZP_07494023.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >ref|ZP_07816044.1| acyl carrier protein [Mycobacterium tuberculosis KZN V2475] >ref|YP_004723912.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >ref|YP_004745700.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >sp|P0A4W6.1|ACPM_MYCTU RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >sp|P0A4W7.1|ACPM_MYCBO RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA94640.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium tuberculosis H37Rv] >gb|AAK46588.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >emb|CAD97121.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium bovis AF2122/97] >emb|CAL72249.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. Pasteur 1173P2] >gb|EAY60463.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >gb|EBA42598.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. 
Haarlem] >gb|ABQ74026.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium tuberculosis H37Ra] >gb|ABR06604.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis F11] >dbj|BAH26539.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >gb|ACT24794.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >gb|EFD13913.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >gb|EFD18035.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >gb|EFD21093.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >gb|EFD43942.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >gb|EFD47767.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >gb|EFD53534.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >gb|EFD58858.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >gb|EFD62368.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >gb|EFD73930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >gb|EFD77945.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >gb|EFI30824.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >gb|EFO74536.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >gb|EFP15742.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >gb|EFP19094.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >gb|EFP22930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >gb|EFP26734.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 
SUMu005] >gb|EFP30496.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >gb|EFP33906.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >gb|EFP38213.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >gb|EFP42922.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >gb|EFP46864.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >gb|EFP50800.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >gb|EFP54373.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >gb|EGB28294.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CDC1551A] >gb|EGE50793.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis W-148] >gb|AEB03875.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 4207] >gb|AEJ47271.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5079] >gb|AEJ50890.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5180] >emb|CCC27325.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >emb|CCC44598.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >emb|CCC64838.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. Moreau RDJ] 223 223 100% 1e-74 1KLP_A Chain A, The Solution Structure Of Acyl Carrier Protein From Mycobacterium Tuberculosis 220 220 99% 2e-73 ZP_04748738.1 acyl carrier protein [Mycobacterium kansasii ATCC 12478] 165 165 100% 9e-52 ZP_05224070.1 acyl carrier protein [Mycobacterium intracellulare ATCC 13950] 162 162 100% 8e-51 NP_960931.1 acyl carrier protein [Mycobacterium avium subsp. 
paratuberculosis K-10] >ref|YP_881402.1| acyl carrier protein [Mycobacterium avium 104] >ref|ZP_05216419.1| acyl carrier protein [Mycobacterium avium subsp. avium ATCC 25291] >gb|AAS04314.1| AcpM [Mycobacterium avium subsp. paratuberculosis K-10] >gb|ABK65172.1| acyl carrier protein [Mycobacterium avium 104] >gb|EGO40713.1| acyl carrier protein [Mycobacterium avium subsp. paratuberculosis S397] 162 162 100% 8e-51
NP_302135.1 acyl carrier protein [Mycobacterium leprae TN] >ref|YP_002503765.1| acyl carrier protein [Mycobacterium leprae Br4923] >sp|O69475.1|ACPM_MYCLE RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA19202.1| acyl carrier protein [Mycobacterium leprae] >emb|CAC30605.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae] >emb|CAR71749.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae Br4923] 162 162 100% 2e-50
ZP_07966703.1 hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] >gb|EFV12044.1| hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] 162 162 88% 3e-50
YP_905336.1 acyl carrier protein [Mycobacterium ulcerans Agy99] >ref|YP_001851618.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] >gb|ABL03865.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium ulcerans Agy99] >gb|ACC41763.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] 161 161 100% 3e-50
ZP_08713925.1 acyl carrier protein [Mycobacterium colombiense CECT 3035] >gb|EGT87768.1| acyl carrier protein [Mycobacterium colombiense CECT 3035] 160 160 100% 6e-50
YP_003660002.1 phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] >gb|ADG99171.1| phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] 160 160 88% 8e-50

where in my code:

print "hit name is ",$hit->name, "\n"; # gives me the reference of the aligned sequence
print"Score: ".$hsp->score."\n"; # gives me the score of the aligned sequence
print"E-val: ".$hsp->expect."\n"; # gives me the evalue of the aligned sequence
print"percent identity: ".$hsp->percent_identity."\n"; # gives me the query coverage of the aligned sequence

I want to use #print "Description ",$hsp->desc, "\n"; to show the description, but it is not working. Can anybody help me with this? I need to know urgently. Thanks for reading; I hope I have explained my problem clearly.

Below is a copy of the code I am trying to use:

use Bio::Tools::Run::RemoteBlast;
use strict;
my $v = 1;
my $prog = 'blastp';
my $db   = 'refseq_protein';
my $e_val= '1e-10'; #1e-10
my $result;
#my $code=q| my $answer = my $a / my $b;|;

my @params = ( '-prog'   => $prog,
               '-data'   => $db,
               '-expect' => $e_val );

my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
$v = 1;
my $str = Bio::SeqIO->new(-file=>'prot.txt' , '-format' => 'fasta' );
my $input;
while($input = $str->next_seq())
{
    #  Blast a sequence against a database:
    my $r = $factory->submit_blast($input);
    print STDERR "waiting..." if( $v > 0 );

    my %hit_evalue;
    my @evalue;

    while ( my @rids = $factory->each_rid ) {
        foreach my $rid ( @rids ) {
            my $rc = $factory->retrieve_blast($rid);
            if( !ref($rc) ) {
                if( $rc < 0 ) {
                    $factory->remove_rid($rid);
                }
                print STDERR "." if ( $v > 0 );
                sleep 5;
            } else {
                $factory->remove_rid($rid);
                #print $rid."\n\n";
                my $result = $rc->next_result;

                print "db is ", $result->database_name(), "\n";
                my $count = 0;
                while( my $hit = $result->next_hit ) {
                    $count++;
                    #next unless ( $v > 0);
                    #print "hit name is ", $hit->name, "\n";
                    while( my $hsp = $hit->next_hsp )
                    {
                        print "hit name is ",$hit->name, "\n";
                        #print "Query name is ",$hsp->desc, "\n"; exit;

                        print"Score: ".$hsp->score."\n";
                        print"E-val: ".$hsp->expect."\n";
                        print"percent identity: ".$hsp->percent_identity."\n";
                    }
                }
            }
        }
    }
}

From pcantalupo at gmail.com Thu Aug 18 08:55:18 2011
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Thu, 18 Aug 2011 08:55:18 -0400
Subject: [Bioperl-l] query about Bio::Tools::Run::RemoteBlast
In-Reply-To: <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu> <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>
Message-ID:

Punit,

I think you want '$hit->description', not '$hsp->desc'.

Paul

Paul Cantalupo
University of Pittsburgh

On Thu, Aug 18, 2011 at 8:14 AM, punit kumar wrote:
> Hi friends,
>
> I am new to BioPerl, and I am using "Bio::Tools::Run::RemoteBlast" for remote BLAST. I tried this module and have had a little success so far. I want to get the description part of the BLAST alignments found against my query sequence, as shown in the format below, which is the output table of online BLAST:
>
> Sequences producing significant alignments:
> Accession  Description  Max score  Total score  Query coverage  E value  Links
> NP_216760.1 acyl carrier protein [Mycobacterium tuberculosis H37Rv] >ref|NP_336774.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >ref|NP_855917.1| acyl carrier protein [Mycobacterium bovis AF2122/97] >ref|YP_978350.1| acyl carrier protein [Mycobacterium bovis BCG str. Pasteur 1173P2] >ref|YP_001283588.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_001288206.1| acyl carrier protein [Mycobacterium tuberculosis F11] >ref|ZP_02551632.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_002645307.1| acyl carrier protein [Mycobacterium bovis BCG str.
Tokyo 172] >ref|YP_003031689.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >ref|ZP_04925721.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >ref|ZP_04981085.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. Haarlem] >ref|ZP_05141736.1| acyl carrier > protein [Mycobacterium tuberculosis '98-R604 INH-RIF-EM'] >ref|ZP_06433498.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >ref|ZP_06437620.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >ref|ZP_06443178.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >ref|ZP_06450592.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >ref|ZP_06455160.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >ref|ZP_06504896.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >ref|ZP_06510220.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >ref|ZP_06513730.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >ref|ZP_06517747.1| meromycolate extension acyl carrier protein acpm > [Mycobacterium tuberculosis T85] >ref|ZP_06521786.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >ref|ZP_06799170.1| acyl carrier protein [Mycobacterium tuberculosis 210] >ref|ZP_06952619.1| acyl carrier protein [Mycobacterium tuberculosis KZN 4207] >ref|ZP_06960948.1| acyl carrier protein [Mycobacterium tuberculosis KZN R506] >ref|ZP_07013145.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >ref|ZP_07414839.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >ref|ZP_07418616.1| meromycolate extension acyl carrier protein acpM [Mycobacterium 
tuberculosis SUMu002] >ref|ZP_07423348.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >ref|ZP_07427715.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >ref|ZP_07432018.1| meromycolate extension acyl carrier protein > acpM [Mycobacterium tuberculosis SUMu005] >ref|ZP_07436410.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >ref|ZP_07440655.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >ref|ZP_07445228.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >ref|ZP_07481045.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >ref|ZP_07485275.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >ref|ZP_07489492.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >ref|ZP_07494023.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >ref|ZP_07816044.1| acyl carrier protein [Mycobacterium tuberculosis KZN V2475] >ref|YP_004723912.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum > GM041182] >ref|YP_004745700.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >sp|P0A4W6.1|ACPM_MYCTU RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >sp|P0A4W7.1|ACPM_MYCBO RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA94640.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium tuberculosis H37Rv] >gb|AAK46588.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >emb|CAD97121.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium bovis AF2122/97] >emb|CAL72249.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. 
Pasteur 1173P2] >gb|EAY60463.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >gb|EBA42598.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. Haarlem] >gb|ABQ74026.1| meromycolate extension acyl carrier protein AcpM > [Mycobacterium tuberculosis H37Ra] >gb|ABR06604.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis F11] >dbj|BAH26539.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >gb|ACT24794.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >gb|EFD13913.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >gb|EFD18035.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >gb|EFD21093.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >gb|EFD43942.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >gb|EFD47767.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >gb|EFD53534.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >gb|EFD58858.1| meromycolate extension acyl carrier > protein acpM [Mycobacterium tuberculosis T92] >gb|EFD62368.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >gb|EFD73930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >gb|EFD77945.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >gb|EFI30824.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >gb|EFO74536.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >gb|EFP15742.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >gb|EFP19094.1| meromycolate extension acyl carrier protein acpM [Mycobacterium 
tuberculosis SUMu003] >gb|EFP22930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >gb|EFP26734.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu005] > >gb|EFP30496.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >gb|EFP33906.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >gb|EFP38213.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >gb|EFP42922.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >gb|EFP46864.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >gb|EFP50800.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >gb|EFP54373.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >gb|EGB28294.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CDC1551A] >gb|EGE50793.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis W-148] >gb|AEB03875.1| meromycolate extension acyl > carrier protein acpM [Mycobacterium tuberculosis KZN 4207] >gb|AEJ47271.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5079] >gb|AEJ50890.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5180] >emb|CCC27325.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >emb|CCC44598.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >emb|CCC64838.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. 
Moreau RDJ] 223 223 100% 1e-74 > 1KLP_A Chain A, The Solution Structure Of Acyl Carrier Protein From Mycobacterium Tuberculosis 220 220 99% 2e-73 > ZP_04748738.1 acyl carrier protein [Mycobacterium kansasii ATCC 12478] 165 165 100% 9e-52 > ZP_05224070.1 acyl carrier protein [Mycobacterium intracellulare ATCC 13950] 162 162 100% 8e-51 > NP_960931.1 acyl carrier protein [Mycobacterium avium subsp. paratuberculosis K-10] >ref|YP_881402.1| acyl carrier protein [Mycobacterium avium 104] >ref|ZP_05216419.1| acyl carrier protein [Mycobacterium avium subsp. avium ATCC 25291] >gb|AAS04314.1| AcpM [Mycobacterium avium subsp. paratuberculosis K-10] >gb|ABK65172.1| acyl carrier protein [Mycobacterium avium 104] >gb|EGO40713.1| acyl carrier protein [Mycobacterium avium subsp. paratuberculosis S397] 162 162 100% 8e-51 > NP_302135.1 acyl carrier protein [Mycobacterium leprae TN] >ref|YP_002503765.1| acyl carrier protein [Mycobacterium leprae Br4923] >sp|O69475.1|ACPM_MYCLE RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA19202.1| acyl carrier protein [Mycobacterium leprae] >emb|CAC30605.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae] >emb|CAR71749.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae Br4923] 162 162 100% 2e-50 > ZP_07966703.1 hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] >gb|EFV12044.1| hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] 162 162 88% 3e-50 > YP_905336.1 acyl carrier protein [Mycobacterium ulcerans Agy99] >ref|YP_001851618.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] >gb|ABL03865.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium ulcerans Agy99] >gb|ACC41763.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] 161 161 100% 3e-50 > ZP_08713925.1 acyl carrier protein [Mycobacterium colombiense CECT 3035] >gb|EGT87768.1| acyl carrier protein [Mycobacterium 
colombiense CECT 3035] 160 160 100% 6e-50
> YP_003660002.1 phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] >gb|ADG99171.1| phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] 160 160 88% 8e-50
>
> where in my code:
>
> print "hit name is ",$hit->name, "\n"; # gives me the refrence of aligned sequence
> print"Score: ".$hsp->score."\n"; # gives me the score of aligned sequence
> print"E-val: ".$hsp->expect."\n"; # gives me the evalue of aligned sequence
> print"percent identity: ".$hsp->percent_identity."\n"; # gives me the query coverage of aligned sequence
>
> i want to use #print "Description ",$hsp->desc, "\n"; to show the description but i am not getting can any body help me out for this i need to know urgently, thanks to read and i hope i was succesfull to explain my problem .
>
> below is the copy of my code i am trying to use :
>
> use Bio::Tools::Run::RemoteBlast;
> use strict;
> my $v = 1;
> my $prog = 'blastp';
> my $db = 'refseq_protein';
> my $e_val= '1e-10'; #1e-10
>
> my $result;
> #my $code=q| my $answer = my $a / my $b;|;
>
> my @params = (
>     '-prog' => $prog,
>     '-data' => $db,
>     '-expect' => $e_val
> );
>
> my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> $v = 1;
> my $str = Bio::SeqIO->new(-file=>'prot.txt' , '-format' => 'fasta' );
> my $input;
> while($input = $str->next_seq())
> {
>
>     # Blast a sequence against a database:
>
>     my $r = $factory->submit_blast($input);
>     print STDERR "waiting..." if( $v > 0 );
>
>     my %hit_evalue;
>     my @evalue;
>
>     while ( my @rids = $factory->each_rid ) {
>         foreach my $rid ( @rids ) {
>             my $rc = $factory->retrieve_blast($rid);
>             if( !ref($rc) ) {
>                 if( $rc < 0 ) {
>                     $factory->remove_rid($rid);
>                 }
>                 print STDERR "." if ( $v > 0 );
>                 sleep 5;
>             } else {
>                 $factory->remove_rid($rid);
>                 #print $rid."\n\n";
>                 my $result = $rc->next_result;
>
>                 print "db is ", $result->database_name(), "\n";
>                 my $count = 0;
>                 while( my $hit = $result->next_hit ) {
>                     $count++;
>                     #next unless ( $v > 0);
>                     #print "hit name is ", $hit->name, "\n";
>                     while( my $hsp = $hit->next_hsp )
>                     {
>                         print "hit name is ",$hit->name, "\n";
>                         #print "Query name is ",$hsp->desc, "\n"; exit;
>
>                         print"Score: ".$hsp->score."\n";
>                         print"E-val: ".$hsp->expect."\n";
>                         print"percent identity: ".$hsp->percent_identity."\n";
>                     }
>
>                 }
>             }
>         }
>     }
> }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From David.Messina at sbc.su.se Fri Aug 19 05:07:35 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 19 Aug 2011 11:07:35 +0200
Subject: [Bioperl-l] Fwd: pls help..
In-Reply-To:
References:
Message-ID:

Whoops, resending; the attachment was too big.

Ravi, please provide only a few example lines from your GFF file, or host the file elsewhere and post a link to it.

Dave

---------- Forwarded message ----------
From: Dave Messina
Date: Fri, Aug 19, 2011 at 10:53
Subject: pls help..
To: ravi.devani89 at gmail.com
Cc: bioperl-l

Ravi,

Your message belongs on the main BioPerl list, not the bioperl-dev list, so I'm reposting it there.

To sign up for the main list, go to:
http://bioperl.org/mailman/listinfo/bioperl-l

Dave

---------- Forwarded message ----------
From: Ravi Devani
To: bioperl-dev at lists.open-bio.org
Date: Fri, 19 Aug 2011 13:54:22 +0530
Subject: Fwd: pls help..

i tried to create a gff3 file from .gbk file using bioperl genbank2gff3 script but what i get is same features repeating many times.. and the file keeps growing in size until my harddisk gets full.. i have tried to filter all other features except "region" but still it repeats a single entry many times..
i have attached a part of the file generated.. pls kindly help me.

From ravi.devani89 at gmail.com Fri Aug 19 01:16:00 2011
From: ravi.devani89 at gmail.com (Ravi Devani)
Date: Fri, 19 Aug 2011 10:46:00 +0530
Subject: [Bioperl-l] pls help..
In-Reply-To:
References:
Message-ID:

---------- Forwarded message ----------
From: Ravi Devani
Date: Thu, Aug 18, 2011 at 12:40 PM
Subject: pls help..
To: scott at scottcain.net

i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but what i get is same features repeating many times.. and the file keeps growing in size until my harddisk gets full.. i have tried to filter all other features except "region" but still it repeats a single entry many times.. i have attached a part of the file generated.. pls kindly help me.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ref_chrUn.gff
Type: application/octet-stream
Size: 602112 bytes
Desc: not available
URL:

From anjan.purkayastha at gmail.com Mon Aug 15 10:32:39 2011
From: anjan.purkayastha at gmail.com (ANJAN PURKAYASTHA)
Date: Mon, 15 Aug 2011 10:32:39 -0400
Subject: [Bioperl-l] Problem with Bio::DB::Taxonomy
Message-ID:

Hello,
I wrote a short test script for the Bio::DB::Taxonomy module:
================================================
#!/usr/bin/perl -w
use strict;
use Bio::DB::Taxonomy;

my ($nodesfile, $namesfile)= ('nodes.dmp', 'names.dmp');

my $db= new Bio::DB::Taxonomy(-source => 'flatfile',
                              -nodesfile => $nodesfile,
                              -namesfile => $namesfile
                             );

my $bacteria= $db->get_Taxonomy_Node(-taxonid => '2');
print("$bacteria->id\t$bacteria->name\n");
================================================

On running this script I expect the following output: 2 Bacteria.

Instead I get a warning:
UNIVERSAL->import is deprecated and will be removed in a future perl at /usr/share/perl5/vendor_perl/Bio/Tree/TreeFunctionsI.pm line 94.

and the following output:
Bio::Taxon=HASH(0x158dbe0)->id Bio::Taxon=HASH(0x158dbe0)->name

The script seems to be working but there seems to be a problem with dereferencing a Bio::Taxon object.

Any leads on how to troubleshoot this will be much appreciated.
Thanks
Anjan

--
===================================
Anjan Purkayastha, PhD
Senior Computational Biologist
TessArae LLC
46090 Lake Center Plaza, Suite 304
Potomac Falls, VA 20165
Office- 703.444.7188 ext. 116
Mobile-703.740.6939
===================================

From scott at scottcain.net Fri Aug 19 09:45:47 2011
From: scott at scottcain.net (Scott Cain)
Date: Fri, 19 Aug 2011 09:45:47 -0400
Subject: [Bioperl-l] pls help..
In-Reply-To:
References:
Message-ID: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net>

Ravi,

The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using.

Scott

Sent from my iPhone

On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote:

> ---------- Forwarded message ----------
> From: Ravi Devani
> Date: Thu, Aug 18, 2011 at 12:40 PM
> Subject: pls help..
> To: scott at scottcain.net
>
>
> i tried to create a gff3 file from .gbk file using
> bp_genbank2gff3.pl but
> what i get is same features repeating many times.. and the file keeps
> growing in size ntil my harddisk gets full.. i have tried to filter
> all
> other features except "region" but still it repeats a single entry
> many
> times.. i have attached a part of the file generated.. pls kindly
> help me.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at illinois.edu Fri Aug 19 10:05:03 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 19 Aug 2011 09:05:03 -0500
Subject: [Bioperl-l] Problem with Bio::DB::Taxonomy
In-Reply-To:
References:
Message-ID: <7A733494-D831-43ED-9AE4-AB62AC5A2761@illinois.edu>

Anjan,

You are likely using an old version of BioPerl (this was fixed in the latest release on CPAN, I believe). Bio::DB::Taxonomy uses Bio::Taxon, so the use of name() is incorrect; the method is node_name(). If name() is documented somewhere, that documentation is wrong, so let us know where it came from.

Also, the print statement at the end isn't interpolating correctly (Perl interpolates the variable, but not the method call, inside a double-quoted string); in general with objects I make this more explicit:

print $bacteria->id."\t".$bacteria->node_name."\n";

Correcting that, it works for me:

[cjfields@pyrimidine1 anjan]$ perl test.pl
2 Bacteria

chris

On Aug 15, 2011, at 9:32 AM, ANJAN PURKAYASTHA wrote:

> Hello,
> I wrote a short test script for the Bio::DB::Taxonomy module:
> ================================================
> #!/usr/bin/perl -w
> use strict;
> use Bio::DB::Taxonomy;
>
> my ($nodesfile, $namesfile)= ('nodes.dmp', 'names.dmp');
>
> my $db= new Bio::DB::Taxonomy(-source => 'flatfile',
>                               -nodesfile => $nodesfile,
>                               -namesfile => $namesfile
>                              );
>
> my $bacteria= $db->get_Taxonomy_Node(-taxonid => '2');
> print("$bacteria->id\t$bacteria->name\n");
> ================================================
>
> On running this script I expect the following output: 2 Bacteria.
>
> Instead I get a warning:
> UNIVERSAL->import is deprecated and will be removed in a future perl at
> /usr/share/perl5/vendor_perl/Bio/Tree/TreeFunctionsI.pm line 94.
> > and the following ouput: > Bio::Taxon=HASH(0x158dbe0)->id Bio::Taxon=HASH(0x158dbe0)->name > > The script seems to be working but there seems to be a problem with > dereferencing a Bio::Taxon object. > > Any leads on how to troubleshoot this will be much appreciated. > Thanks > Anjan > > > > -- > =================================== > Anjan Purkayastha, PhD > Senior Computational Biologist > TessArae LLC > 46090 Lake Center Plaza, Suite 304 > Potomac Falls, VA 20165** > Office- 703.444.7188 ext. 116 > Mobile-703.740.6939 > =================================== > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 19 10:26:06 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 19 Aug 2011 09:26:06 -0500 Subject: [Bioperl-l] pls help.. In-Reply-To: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> Message-ID: <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Scott, http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview (it's in the GFF file) It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note). On Aug 19, 2011, at 8:45 AM, Scott Cain wrote: > Ravi, > > The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using. > > Scott > > > Sent from my iPhone > > On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: > >> ---------- Forwarded message ---------- >> From: Ravi Devani >> Date: Thu, Aug 18, 2011 at 12:40 PM >> Subject: pls help.. >> To: scott at scottcain.net >> >> >> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >> what i get is same features repeating many times.. 
and the file keeps >> growing in size ntil my harddisk gets full.. i have tried to filter all >> other features except "region" but still it repeats a single entry many >> times.. i have attached a part of the file generated.. pls kindly help me. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Fri Aug 19 10:38:16 2011 From: scott at scottcain.net (Scott Cain) Date: Fri, 19 Aug 2011 10:38:16 -0400 Subject: [Bioperl-l] pls help.. In-Reply-To: <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Message-ID: I was wondering if perhaps the genbank file had been manipulated in some way. Scott On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields wrote: > Scott, > > http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview > > (it's in the GFF file) > > It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note). > > On Aug 19, 2011, at 8:45 AM, Scott Cain wrote: > >> Ravi, >> >> The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? ?Also, please indicate what version of bioperl you're using. >> >> Scott >> >> >> Sent from my iPhone >> >> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: >> >>> ---------- Forwarded message ---------- >>> From: Ravi Devani >>> Date: Thu, Aug 18, 2011 at 12:40 PM >>> Subject: pls help.. >>> To: scott at scottcain.net >>> >>> >>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >>> what i get is same features repeating many times.. 
>>> and the file keeps
>>> growing in size ntil my harddisk gets full.. i have tried to filter all
>>> other features except "region" but still it repeats a single entry many
>>> times.. i have attached a part of the file generated.. pls kindly help me.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)         216-392-3087
Ontario Institute for Cancer Research

From cjfields at illinois.edu Fri Aug 19 15:19:40 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 19 Aug 2011 14:19:40 -0500
Subject: [Bioperl-l] pls help..
In-Reply-To:
References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>
Message-ID:

Yeah, the output is rather odd. Maybe it's using the contig file version?

chris

On Aug 19, 2011, at 9:38 AM, Scott Cain wrote:

> I was wondering if perhaps the genbank file had been manipulated in some way.
>
> Scott
>
>
> On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields wrote:
>> Scott,
>>
>> http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview
>>
>> (it's in the GFF file)
>>
>> It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note).
>>
>> On Aug 19, 2011, at 8:45 AM, Scott Cain wrote:
>>
>>> Ravi,
>>>
>>> The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using.
>>> >>> Scott >>> >>> >>> Sent from my iPhone >>> >>> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: >>> >>>> ---------- Forwarded message ---------- >>>> From: Ravi Devani >>>> Date: Thu, Aug 18, 2011 at 12:40 PM >>>> Subject: pls help.. >>>> To: scott at scottcain.net >>>> >>>> >>>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >>>> what i get is same features repeating many times.. and the file keeps >>>> growing in size ntil my harddisk gets full.. i have tried to filter all >>>> other features except "region" but still it repeats a single entry many >>>> times.. i have attached a part of the file generated.. pls kindly help me. >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > > > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. scott at scottcain dot net > GMOD Coordinator (http://gmod.org/) 216-392-3087 > Ontario Institute for Cancer Research From hlapp at drycafe.net Fri Aug 19 23:38:51 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 19 Aug 2011 22:38:51 -0500 Subject: [Bioperl-l] [BioSQL-l] How is is_circular recorded in BioSQL (by BioPerl)? In-Reply-To: <4E2D79D6.6020108@gmail.com> References: <4E2D5000.30305@gmail.com> <4E2D5314.5090107@gmail.com> <4E2D5BAC.8020001@gmail.com> <4E2D79D6.6020108@gmail.com> Message-ID: <59AF5708-AECD-4375-9EB8-6E79D4B21C26@drycafe.net> I realize I'm chiming in here late, but the below sums it up quite well. In fact, biosequence.alphabet column was originally (pre-2002) called molecule, and the BioPerl Genbank writer defaults to alphabet() if molecule() is not defined. -hilmar Sent with a tap. 
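The fallback Hilmar describes (use alphabet() when molecule() is not defined) can be sketched roughly as follows. This is only an illustration under stated assumptions: `My::Seq` is a hypothetical stand-in class and `locus_molecule` a hypothetical helper, not the real Bio::Seq::RichSeq object or the actual GenBank writer code in BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object. The real BioPerl classes
# are deliberately not used, so this runs without BioPerl installed.
package My::Seq;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}
sub molecule { $_[0]->{molecule} }   # may be undef
sub alphabet { $_[0]->{alphabet} }   # e.g. 'dna', 'rna', 'protein'

package main;

# Hypothetical helper: choose the molecule type for a LOCUS line,
# falling back to alphabet() when molecule() is not defined.
sub locus_molecule {
    my ($seq) = @_;
    my $mol = $seq->molecule;
    return defined $mol ? $mol : $seq->alphabet;
}

my $with_mol    = My::Seq->new(molecule => 'DNA', alphabet => 'dna');
my $without_mol = My::Seq->new(alphabet => 'dna');

print locus_molecule($with_mol), "\n";     # molecule() is defined, so it is used
print locus_molecule($without_mol), "\n";  # molecule() is undef, so alphabet() is used
```

If only the alphabet survives a round trip through the database, a fallback of this kind would be consistent with the lower-case molecule type mentioned in the quoted message below.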
On Jul 25, 2011, at 9:12 AM, Roy Chaudhuri wrote:

> As with the is_circular hack, you could store the molecule type by adding it as an annotation in the SequenceProcessor (it's stored as $seq->molecule by BioPerl).
>
> Actually, when round-tripping a GenBank file through BioSQL, the LOCUS line molecule type ends up in lower case, which makes me wonder if it is coming from alphabet in the biosequence table.

From hlapp at drycafe.net Fri Aug 19 21:02:12 2011
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 19 Aug 2011 20:02:12 -0500
Subject: [Bioperl-l] Error writing SequenceProcessor to associate GO terms in biosql database
In-Reply-To: <26C59A57-F54A-4237-8D97-4E7A77E55D59@sgul.ac.uk>
References: <26C59A57-F54A-4237-8D97-4E7A77E55D59@sgul.ac.uk>
Message-ID: <6BDB69DE-5856-4061-96FA-0CF2884EDD9E@drycafe.net>

Hi Adam,

I'm not sure whether you've received a response to this. Apologies if not.

There is indeed a NOT NULL constraint on seqfeature_qualifier_value.value. The only other metadata association table in BioSQL that does this is location_qualifier_value. In the latter case there is arguably some sense to that (at least originally for locations the purpose of that table was pretty much to store the fuzzy location start/end properties), but for seqfeatures this looks like a bug to me.

I'll post this to the BioSQL list and fix it if there are no objections, but feel free to drop the NOT NULL on that column yourself in the meantime.

The INSERT query gets constructed in the innards of Bioperl-db. There is no reason to mess with that for this problem though - just drop the NOT NULL constraint.

-hilmar

Sent with a tap.

On Jul 26, 2011, at 10:07 AM, Adam Witney wrote:

>
> Hi,
>
> I'm trying to write a SequenceProcessor for a genbank file to associate GO terms to the GO data preloaded in my biosql database.
> The command looks like this:
>
> perl load_seqdatabase.pl --dbname=biosql --driver=Pg --host=myhost --port=5432 --dbuser=user --dbpass=pass -format genbank -namespace testing -pipeline 'GOSequenceProcessor' --debug S_sonnei.EB1_s_sonnei.dat
>
> The SequenceProcessor process_seq looks like this:
>
> sub process_seq{
>     my ($self,$seq) = @_;
>
>     my @features = $seq->get_SeqFeatures();
>     foreach my $feat ( @features ) {
>         if ( $feat->has_tag('db_xref') ) {
>             my @db_xrefs = $feat->get_tag_values('db_xref');
>
>             foreach my $db_xref (@db_xrefs) {
>                 if ( $db_xref =~ m/^GO:/ ) {
>                     my $term = Bio::Annotation::OntologyTerm->new(-identifier => $db_xref,
>                                                                   -ontology => 'Gene Ontology');
>                     $feat->annotation->add_Annotation($term);
>                 }
>             }
>         }
>     }
>
>     return ($seq);
> }
>
> But this gives this error:
>
> preparing INSERT statement: INSERT INTO seqfeature_qualifier_value (seqfeature_id, term_id, rank) VALUES (?, ?, ?)
> TermAdaptor::add_assoc: binding column 1 to "935181" (FK to Bio::SeqFeature::Generic)
> TermAdaptor::add_assoc: binding column 2 to "50253" (FK to Bio::Annotation::OntologyTerm)
> TermAdaptor::add_assoc: binding column 3 to "1" (rank)
>
> --------------------- WARNING ---------------------
> MSG: TermAdaptor::add_assoc: unexpected failure of statement execution: ERROR: null value in column "value" violates not-null constraint
> name: INSERT ASSOC [1] Bio::SeqFeature::Generic;Bio::Annotation::OntologyTerm
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::add_association /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:458
> STACK Bio::DB::BioSQL::AnnotationCollectionAdaptor::add_association /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:468
> STACK Bio::DB::BioSQL::SeqFeatureAdaptor::store_children /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/SeqFeatureAdaptor.pm:304
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
> STACK Bio::DB::Persistent::PersistentObject::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/Persistent/PersistentObject.pm:284
> STACK Bio::DB::BioSQL::SeqAdaptor::store_children /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/SeqAdaptor.pm:257
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
> STACK Bio::DB::Persistent::PersistentObject::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/Persistent/PersistentObject.pm:284
> STACK (eval) /var/users/adam/BioPerl/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
> STACK toplevel /var/users/adam/BioPerl/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
>
> As you can see it generates an INSERT against seqfeature_qualifier_value without including a 'value' field, which is of course defined as NOT NULL.
>
> Firstly, is this the best way to achieve this? And secondly, where is the INSERT statement put together, I can't seem to find it in the object hierarchy
>
> Thanks
>
> adam
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From ulrik.stervbo at gmail.com Sun Aug 21 13:33:44 2011
From: ulrik.stervbo at gmail.com (Ulrik Stervbo)
Date: Sun, 21 Aug 2011 19:33:44 +0200
Subject: [Bioperl-l] Change of Expasy Protparam url
Message-ID:

It seems there are some minor changes to the URLs for the ExPASy services.
In the Protparam.pm, line 110 should be changed from

@args=('-url'=>'http://www.expasy.org/cgi-bin/protparam','-form'=>'sequence',@args);

to

@args=('-url'=>'http://web.expasy.org/cgi-bin/protparam/protparam','-form'=>'sequence',@args);

At least it seems to be working here, after adding the change to my local Protparam.pm

Cheers,
Ulrik

From cjfields at illinois.edu Sun Aug 21 13:56:10 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 21 Aug 2011 12:56:10 -0500
Subject: [Bioperl-l] Change of Expasy Protparam url
In-Reply-To:
References:
Message-ID: <9178B7E4-6EF2-4BC7-9B1C-9E5B282B5012@illinois.edu>

Thanks for pointing that out. I've updated that on github. The critical thing is to get some tests working, so that a webservice failure doesn't slip by again without raising an exception (so we can track this).

chris

On Aug 21, 2011, at 12:33 PM, Ulrik Stervbo wrote:

> it seems the there are some minor changes with the urls for the expasy-services.
>
> In the Protparam.pm, line 110 should be changed from
> @args=('-url'=>'http://www.expasy.org/cgi-bin/protparam','-form'=>'sequence',@args);
>
> to
>
> @args=('-url'=>'http://web.expasy.org/cgi-bin/protparam/protparam','-form'=>'sequence',@args);
>
> At least it seems to be working here, after adding the change to my
> local Protparam.pm
>
> Cheers,
> Ulrik
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From scott at scottcain.net Mon Aug 22 11:51:55 2011
From: scott at scottcain.net (Scott Cain)
Date: Mon, 22 Aug 2011 11:51:55 -0400
Subject: [Bioperl-l] pls help..
In-Reply-To:
References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>
Message-ID:

Hello Ravi,

Please keep the BioPerl mailing list cc'ed.
I downloaded your 1.7GB multi-genbank file and started processing it with bp_genbank2gff3.pl. I killed it when the GFF file got to 10GB; however, it was working as expected. I suggest you upgrade to the most recent release of BioPerl and try again. Additionally, it might make sense to break that big multi-genbank file into smaller files.

Scott

On Sun, Aug 21, 2011 at 11:33 AM, Ravi Devani wrote:
> scott i hv given the link to the gbk file, please kindly help me
>
> On 8/19/11, Scott Cain wrote:
>> Ravi,
>>
>> I also meant to ask what version of BioPerl you are using. When I run
>> this command
>>
>>   bp_genbank2gff3.pl NW_002121371.gbk
>>
>> I get a rather dull GFF3 file with 4 lines of GFF (one region and
>> three gaps) and a fasta section.
>>
>> Scott
>>
>>
>> On Fri, Aug 19, 2011 at 12:33 PM, Ravi Devani
>> wrote:
>>> No the genbank file has not been manipulated
>>>
>>> On 8/19/11, Scott Cain wrote:
>>>> I was wondering if perhaps the genbank file had been manipulated in some
>>>> way.
>>>>
>>>> Scott
>>>>
>>>>
>>>> On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields
>>>> wrote:
>>>>> Scott,
>>>>>
>>>>> http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview
>>>>>
>>>>> (it's in the GFF file)
>>>>>
>>>>> It definitely is getting stuck in a loop for the genomic region, but
>>>>> using
>>>>> the file for GFF3 doesn't make sense (very few features of note).
>>>>>
>>>>> On Aug 19, 2011, at 8:45 AM, Scott Cain wrote:
>>>>>
>>>>>> Ravi,
>>>>>>
>>>>>> The gff file is fairly useless from a debugging perspective. Can you
>>>>>> please attach the genbank file you're using? Also, please indicate
>>>>>> what
>>>>>> version of bioperl you're using.
>>>>>>
>>>>>> Scott
>>>>>>
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On Aug 19, 2011, at 1:16 AM, Ravi Devani
>>>>>> wrote:
>>>>>>
>>>>>>> ---------- Forwarded message ----------
>>>>>>> From: Ravi Devani
>>>>>>> Date: Thu, Aug 18, 2011 at 12:40 PM
>>>>>>> Subject: pls help..
>>>>>>> To: scott at scottcain.net
>>>>>>>
>>>>>>>
>>>>>>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl
>>>>>>> but
>>>>>>> what i get is same features repeating many times.. and the file keeps
>>>>>>> growing in size ntil my harddisk gets full.. i have tried to filter
>>>>>>> all
>>>>>>> other features except "region" but still it repeats a single entry
>>>>>>> many
>>>>>>> times.. i have attached a part of the file generated.. pls kindly
>>>>>>> help
>>>>>>> me.
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioperl-l mailing list
>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> ------------------------------------------------------------------------
>>>> Scott Cain, Ph. D.                          scott at scottcain
>>>> dot
>>>> net
>>>> GMOD Coordinator (http://gmod.org/)         216-392-3087
>>>> Ontario Institute for Cancer Research
>>>>
>>>
>>
>>
>>
>> --
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                          scott at scottcain dot
>> net
>> GMOD Coordinator (http://gmod.org/)         216-392-3087
>> Ontario Institute for Cancer Research
>>
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)
216-392-3087
Ontario Institute for Cancer Research

From allenday at ionflux.com Mon Aug 22 14:40:33 2011
From: allenday at ionflux.com (Allen Day, PhD)
Date: Mon, 22 Aug 2011 18:40:33 +0000
Subject: [Bioperl-l] Beijing and Los Angeles Human NGS Biostatistics/Informatics jobs
Message-ID:

Hi all,

Ion Flux is a startup that I just created to apply NGS technology to the clinical diagnostics field. We like to think of ourselves as an enterprise class "23andme". This is an early-stage startup -- you will have a chance to influence the company and to be rewarded accordingly. I am Allen, the founder.

We have a couple of open positions for smart, passionate scientist / engineering types. Others need not apply. Please check out these job descriptions if this sparks your interest:

http://ionflux.com/blog/careers/bioinformatician-data-modeling-and-processing/
http://ionflux.com/blog/careers/bioinformatician-data-analysis-and-algorithms/

Our offices are in Los Angeles (UCLA adjacent) and Beijing.

I'm happy to post future openings to other lists if this isn't the right venue for an occasional job announcement.

-Allen

From acpatel at gmail.com Mon Aug 22 15:25:50 2011
From: acpatel at gmail.com (Anand Patel)
Date: Mon, 22 Aug 2011 14:25:50 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
Message-ID:

I'm trying to get Primer3Redux to work, and am noticing some strange things. While I found and changed my parameters to the new primer3 2.2.3 parameters, I still can't find add_targets.

Assigning the parameters using set_parameters works for primer3redux; add_targets is "leftover" from primer3.

So is this a doc/POD issue?

Thanks,
Anand

Anand C.
Patel, MD MS
Washington University School of Medicine
acpatel at gmail.com

From cjfields1 at gmail.com Mon Aug 22 15:42:28 2011
From: cjfields1 at gmail.com (Christopher Fields)
Date: Mon, 22 Aug 2011 14:42:28 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To:
References:
Message-ID:

On Aug 22, 2011, at 2:25 PM, Anand Patel wrote:

> I'm trying to get Primer3Redux to work, and am noticing some strange
> things. While I found and changed my parameters to the new primer3
> 2.2.3 parameters, I still can't find add_targets.
>
> Assigning the parameters using set_parameters works for primer3redux,
> add_targets is "leftover" from primer3.
>
> So is this a doc/POD issue?

I'm confused. You are trying to use add_targets with the latest primer3, but it isn't there? Or is the Primer3Redux wrapper missing this parameter?

chris

> Thanks,
> Anand
>
> Anand C. Patel, MD MS
> Washington University School of Medicine
> acpatel at gmail.com
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From acpatel at gmail.com Mon Aug 22 15:52:12 2011
From: acpatel at gmail.com (Anand Patel)
Date: Mon, 22 Aug 2011 14:52:12 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To:
References:
Message-ID:

my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => "temp.out",
                                                 -path    => "/usr/bin/primer3_core");

If I use this:

$primer3->add_targets(
    'PRIMER_OPT_TM'             => $PRIMER_OPT_TM,
    'PRIMER_MIN_TM'             => $PRIMER_MIN_TM,
    'PRIMER_MAX_TM'             => $PRIMER_MAX_TM,
    'PRIMER_PAIR_MAX_DIFF_TM'   => $PRIMER_MAX_DIFF_TM,
    'PRIMER_MAX_SIZE'           => $PRIMER_MAX_SIZE,
    'PRIMER_OPT_SIZE'           => $PRIMER_OPT_SIZE,
    'PRIMER_MIN_SIZE'           => $PRIMER_MIN_SIZE,
    'PRIMER_MAX_GC'             => $PRIMER_MAX_GC,
    'PRIMER_OPT_GC_PERCENT'     => $PRIMER_OPT_GC_PERCENT,
    'PRIMER_MIN_GC'             => $PRIMER_MIN_GC,
    'SEQUENCE_TARGET'           => $TARGET,
    'PRIMER_PRODUCT_SIZE_RANGE' => $PRIMER_PRODUCT_SIZE_RANGE);

I get:

Can't locate object method "add_targets" via package "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1.

On the other hand, if I change that line to:

$primer3->set_parameters(
    'PRIMER_OPT_TM'             => $PRIMER_OPT_TM,
    'PRIMER_MIN_TM'             => $PRIMER_MIN_TM,
    'PRIMER_MAX_TM'             => $PRIMER_MAX_TM,
    'PRIMER_PAIR_MAX_DIFF_TM'   => $PRIMER_MAX_DIFF_TM,
    'PRIMER_MAX_SIZE'           => $PRIMER_MAX_SIZE,
    'PRIMER_OPT_SIZE'           => $PRIMER_OPT_SIZE,
    'PRIMER_MIN_SIZE'           => $PRIMER_MIN_SIZE,
    'PRIMER_MAX_GC'             => $PRIMER_MAX_GC,
    'PRIMER_OPT_GC_PERCENT'     => $PRIMER_OPT_GC_PERCENT,
    'PRIMER_MIN_GC'             => $PRIMER_MIN_GC,
    'SEQUENCE_TARGET'           => $TARGET,
    'PRIMER_PRODUCT_SIZE_RANGE' => $PRIMER_PRODUCT_SIZE_RANGE);

It works. When I looked at the source code for Primer3Redux, I couldn't find add_targets, but set_parameters looked like it might work, so I used that instead, and it worked.

But I see over in the github that there are other issues with the documentation (how primer3redux's result object is now 3 deep rather than 2 deep). Not sure if this is in that category or not.

Thanks,
Anand

On Mon, Aug 22, 2011 at 2:42 PM, Christopher Fields wrote:
> On Aug 22, 2011, at 2:25 PM, Anand Patel wrote:
>
>> I'm trying to get Primer3Redux to work, and am noticing some strange
>> things. While I found and changed my parameters to the new primer3
>> 2.2.3 parameters, I still can't find add_targets.
>>
>> Assigning the parameters using set_parameters works for primer3redux,
>> add_targets is "leftover" from primer3.
>>
>> So is this a doc/POD issue?
>
> I'm confused. You are trying to use add_targets with the latest primer3, but it isn't there? Or is the Primer3Redux wrapper missing this parameter?
>
> chris
>
>> Thanks,
>> Anand
>>
>> Anand C.
Patel, MD MS >> Washington University School of Medicine >> acpatel at gmail.com >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Mon Aug 22 16:10:25 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 22 Aug 2011 15:10:25 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: References: Message-ID: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote: > my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => > "temp.out", -path => "/usr/bin/primer3_core"); > > If I use this: > $primer3->add_targets( > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > =>$PRIMER_PRODUCT_SIZE_RANGE); > > I get: > Can't locate object method "add_targets" via package > "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. > > On the other hand, if I change that line to: > $primer3->set_parameters( > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > =>$PRIMER_PRODUCT_SIZE_RANGE); > > It works. 
When I looked at the source code for Primer3Redux, I > couldn't find add_targets, but set_parameters looked like it might > work, so I used that instead, and it worked. > > But I see over in the github that there are other issues with the > documentation (how primer3redux's result object is now 3 deep rather > than 2 deep). Not sure if this is in that category or not. That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github; if you can file this as well, it would help. I have to go back and look at the history of add_targets() relative to the primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience... chris > Thanks, > Anand ... From miquel.amat at me.com Tue Aug 23 16:11:15 2011 From: miquel.amat at me.com (Miguel A. Amat) Date: Tue, 23 Aug 2011 16:11:15 -0400 Subject: [Bioperl-l] Installation on OS X Lion Message-ID: I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively via the "perl Build.PL" command. Can you provide some help, or suggest an alternative way of installing BioPerl? From cjfields at illinois.edu Tue Aug 23 20:14:49 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 23 Aug 2011 19:14:49 -0500 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: References: Message-ID: Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. 
DBD::mysql is only needed if you intend on using modules requiring that functionality. chris On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: > I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. > > I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. > > Can you provide some help, or suggest an alternative way of installing BioPerl? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From miquel.amat at me.com Tue Aug 23 23:25:31 2011 From: miquel.amat at me.com (Miguel A Amat) Date: Tue, 23 Aug 2011 23:25:31 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: References: Message-ID: Thanks for the feedback, Chris. Now I just need to get GD to install ... On Aug 23, 2011, at 8:14 PM, Chris Fields wrote: > Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend on using modules requiring that functionality. > > chris > > On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: > >> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >> >> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >> >> Can you provide some help, or suggest an alternative way of installing BioPerl? 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From scott at scottcain.net Wed Aug 24 10:31:44 2011 From: scott at scottcain.net (Scott Cain) Date: Wed, 24 Aug 2011 10:31:44 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: References: Message-ID: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> Hi Miguel, Did you try the installer for snow leopard on sourceforge: http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/ I don't know if it will work on lion but I don't have a copy of lion yet to try it out on. Scott Sent from my iPhone On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote: > Thanks for the feedback, Chris. Now I just need to get GD to > install ... > > On Aug 23, 2011, at 8:14 PM, Chris Fields > wrote: > >> Try installing the latest version from CPAN; this bypasses the >> Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend >> on using modules requiring that functionality. >> >> chris >> >> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >> >>> I am trying to install bioperl on mac os x 10.7 but ran into >>> problems with the dependency packages Bio::ASN1::EntrezGene and >>> DBD::mysql. >>> >>> I am running the latest version of CPAN and perl -v 5.12.3 and the >>> BioPerl-1.6.1 package. The installation was being conducted >>> interactively through via the "perl Build.PL" command. >>> >>> Can you provide some help, or suggest an alternative way of >>> installing BioPerl? 
>>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From sheena.scroggins at gmail.com Wed Aug 24 12:21:07 2011 From: sheena.scroggins at gmail.com (Sheena Scroggins) Date: Wed, 24 Aug 2011 09:21:07 -0700 Subject: [Bioperl-l] End of GSoC Message-ID: I just wanted to give a GIANT thanks to my mentors on the BioPerl project, Rob Buels and Chris Fields. They helped me tremendously and we made great progress on the reorganization. All of the modules we extracted can be found on github at https://github.com/bioperl We used a Dist Zilla plugin bundle, which can also be found there. The steps used in the process will be outlined on the BioPerl wiki in the upcoming weeks. The reorganization is off to a great start and by outlining the workflow I'm hoping others will be able to contribute more easily. The progress updates were posted at techomics.com during the project, although they were sporadic. The original outline of the project can be found there as well. Thanks again to all the mentors of GSoC, this program wouldn't work without you! Sheena From miquel.amat at me.com Wed Aug 24 13:48:06 2011 From: miquel.amat at me.com (Miguel A. Amat) Date: Wed, 24 Aug 2011 13:48:06 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> References: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> Message-ID: <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com> Thanks for all the help; I finally got it to work. Here are the steps I took: upgraded CPAN and used latest version of BioPerl installed dependencies in interactive mode, but GD failed. 
Quit the installation and tried "install GD-SVG"; this one seems to have less functionality than GD, but it worked. Installed Bio::Perl. Then, installed Bio::ASN1::EntrezGene. Best. On Aug 24, 2011, at 10:31 AM, Scott Cain wrote: > Hi Miguel, > > Did you try the installer for snow leopard on sourceforge: > > http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/ > > I don't know if it will work on lion but I don't have a copy of lion yet to try it out on. > > Scott > > > Sent from my iPhone > > On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote: > >> Thanks for the feedback, Chris. Now I just need to get GD to install ... >> >> On Aug 23, 2011, at 8:14 PM, Chris Fields wrote: >> >>> Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend on using modules requiring that functionality. >>> >>> chris >>> >>> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >>> >>>> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >>>> >>>> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >>>> >>>> Can you provide some help, or suggest an alternative way of installing BioPerl? 
>>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Aug 24 13:51:19 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Aug 2011 12:51:19 -0500 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com> References: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com> Message-ID: <200F67E8-7B4E-40AD-9C0D-37160B970F22@illinois.edu> Interesting, since GD::SVG requires GD. Anyway, glad to know it's working for you! chris On Aug 24, 2011, at 12:48 PM, Miguel A. Amat wrote: > Thanks for all the help; I finally got it to work. Here are the steps I took: > > > * upgraded CPAN and used latest version of BioPerl > * installed dependencies in interactive mode, but GD failed. > * Quit the installation and tried "install GD-SVG"; this one seems to have less functionality than GD, but it worked. > * Installed Bio::Perl. > * Then, installed Bio::ASN1::EntrezGene > > > > > > > Best. > > > On Aug 24, 2011, at 10:31 AM, Scott Cain wrote: > >> Hi Miguel, >> >> Did you try the installer for snow leopard on sourceforge: >> >> http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/ >> >> I don't know if it will work on lion but I don't have a copy of lion yet to try it out on. >> >> Scott >> >> >> Sent from my iPhone >> >> On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote: >> >>> Thanks for the feedback, Chris. Now I just need to get GD to install ... >>> >>> On Aug 23, 2011, at 8:14 PM, Chris Fields wrote: >>> >>>> Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. 
DBD::mysql is only needed if you intend on using modules requiring that functionality. >>>> >>>> chris >>>> >>>> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >>>> >>>>> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >>>>> >>>>> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >>>>> >>>>> Can you provide some help, or suggest an alternative way of installing BioPerl? >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From abualiga2 at gmail.com Wed Aug 24 14:09:10 2011 From: abualiga2 at gmail.com (galeb abu-ali) Date: Wed, 24 Aug 2011 14:09:10 -0400 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: Hi, I'm trying to run a program that generates a circular genome homology atlas "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think the problem is with the module that appends schemas to the proxy, and I don't know how to do that manually. I've emailed the author couple times and have not heard back. Pasted below is the error message. At your convenience, I'd greatly appreciate your help. thanks galeb p/s - also, is there another program that can generate concetric circular plots of BLAST scores for multiple bacterial genomes with a per nucleotide resolution? 
thanks [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position Preference" -title "B. pseudomallei K96243" > sgeneric.ps # title set to 'B. pseudomallei K96243' # output format is ps # modus is 'circle' # loading reference genome ... # loading proteins ... # parsing blast lane configuration (blast.cfg) ... # .. parsing blast lane (B. ubonensis Bu) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done # .. parsing blast lane (B. pseudomallei DM98) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done # parsing custom lane configuration (custom.cfg) ... # .. parsing custom data entry SIDD at -0.035 ... # .. .. parsing color 000010_101010 # .. .. .. color from: r:00, g:00, b:10 # .. .. .. color to: r:10, g:10, b:10 # .. .. byrange: 9 .. 10 # .. .. boxfilter 5000 ... # .. parsing data source 'gunzip -c BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz | cut -f4 |' ... # .. .. parsing data source ... 3173005 done # reading external files and build hash of sequences ... 
*panic: schemas() removed in v2.00, not needed anymore* at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at xml-compile.pl line 48 main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at BLASTatlas line 177 From roy.chaudhuri at gmail.com Wed Aug 24 14:21:12 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 24 Aug 2011 19:21:12 +0100 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: <4E554118.90108@gmail.com> Hi Galeb, This is the wrong mailing list for your question - it's intended for discussion of the Bioperl toolkit, not general bioinformatics questions. Next time, try a general bioinformatics mailing list such as BBB: http://www.bioinformatics.org/lists/bbb Having said all that, maybe you could try BRIG: http://sourceforge.net/projects/brig/ http://www.biomedcentral.com/1471-2164/12/402 Cheers, Roy. On 24/08/2011 19:09, galeb abu-ali wrote: > Hi, > > I'm trying to run a program that generates a circular genome homology atlas > "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think > the problem is with the module that appends schemas to the proxy, and I > don't know how to do that manually. I've emailed the author couple times and > have not heard back. Pasted below is the error message. At your convenience, > I'd greatly appreciate your help. > > thanks > > galeb > > p/s - also, is there another program that can generate concetric circular > plots of BLAST scores for multiple bacterial genomes with a per nucleotide > resolution? 
thanks > > [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa > -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg > -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position > Preference" -title "B. pseudomallei K96243"> sgeneric.ps > # title set to 'B. pseudomallei K96243' > # output format is ps > # modus is 'circle' > # loading reference genome ... > # loading proteins ... > # parsing blast lane configuration (blast.cfg) ... > # .. parsing blast lane (B. ubonensis Bu) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done > # .. parsing blast lane (B. pseudomallei DM98) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done > # parsing custom lane configuration (custom.cfg) ... > # .. parsing custom data entry SIDD at -0.035 ... > # .. .. parsing color 000010_101010 > # .. .. .. color from: r:00, g:00, b:10 > # .. .. .. color to: r:10, g:10, b:10 > # .. .. byrange: 9 .. 10 > # .. .. boxfilter 5000 ... > # .. parsing data source 'gunzip -c > BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz > | cut -f4 |' ... > # .. .. parsing data source ... 3173005 done > # reading external files and build hash of sequences ... 
> *panic: schemas() removed in v2.00, not needed anymore* > at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 > XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at > xml-compile.pl line 48 > main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " > http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " > http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at > BLASTatlas line 177 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Aug 24 14:22:26 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Aug 2011 13:22:26 -0500 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: <61E5BF3C-653F-40D3-8764-0DA61859BC8B@illinois.edu> Sorry, but this doesn't have anything to do with BioPerl. Not sure you'll get an answer here. chris On Aug 24, 2011, at 1:09 PM, galeb abu-ali wrote: > Hi, > > I'm trying to run a program that generates a circular genome homology atlas > "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think > the problem is with the module that appends schemas to the proxy, and I > don't know how to do that manually. I've emailed the author couple times and > have not heard back. Pasted below is the error message. At your convenience, > I'd greatly appreciate your help. > > thanks > > galeb > > p/s - also, is there another program that can generate concetric circular > plots of BLAST scores for multiple bacterial genomes with a per nucleotide > resolution? thanks > > [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa > -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg > -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position > Preference" -title "B. pseudomallei K96243" > sgeneric.ps > # title set to 'B. 
pseudomallei K96243' > # output format is ps > # modus is 'circle' > # loading reference genome ... > # loading proteins ... > # parsing blast lane configuration (blast.cfg) ... > # .. parsing blast lane (B. ubonensis Bu) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done > # .. parsing blast lane (B. pseudomallei DM98) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done > # parsing custom lane configuration (custom.cfg) ... > # .. parsing custom data entry SIDD at -0.035 ... > # .. .. parsing color 000010_101010 > # .. .. .. color from: r:00, g:00, b:10 > # .. .. .. color to: r:10, g:10, b:10 > # .. .. byrange: 9 .. 10 > # .. .. boxfilter 5000 ... > # .. parsing data source 'gunzip -c > BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz > | cut -f4 |' ... > # .. .. parsing data source ... 3173005 done > # reading external files and build hash of sequences ... 
> *panic: schemas() removed in v2.00, not needed anymore* > at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 > XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at > xml-compile.pl line 48 > main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " > http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " > http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at > BLASTatlas line 177 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From abualiga2 at gmail.com Wed Aug 24 14:39:33 2011 From: abualiga2 at gmail.com (abualiga2 at gmail.com) Date: Wed, 24 Aug 2011 18:39:33 +0000 Subject: [Bioperl-l] append schema to proxy In-Reply-To: <4E554118.90108@gmail.com> Message-ID: <00504502ec3723598f04ab44a23f@google.com> Roy, thanks! I'll try that. galeb On Aug 24, 2011 2:21pm, Roy Chaudhuri wrote: > Hi Galeb, > This is the wrong mailing list for your question - it's intended for > discussion of the Bioperl toolkit, not general bioinformatics questions. > Next time, try a general bioinformatics mailing list such as BBB: > http://www.bioinformatics.org/lists/bbb > Having said all that, maybe you could try BRIG: > http://sourceforge.net/projects/brig/ > http://www.biomedcentral.com/1471-2164/12/402 > Cheers, > Roy. From slucky at ibab.ac.in Mon Aug 22 02:01:16 2011 From: slucky at ibab.ac.in (Lucky Singh) Date: Mon, 22 Aug 2011 11:31:16 +0530 (IST) Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast Message-ID: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Dear sir/Ma'am, I am student of Institute of Bioinformatics and Applied Biotechnology, Bangalore, India. While doing my project work I needed remoteblast.pm. So I used default example program which is available with this package. 
Now I want to host it on a web server, but the program does not work there. Perhaps it cannot create or write to a file when run from the web server, although it works fine from the command line. I don't know the possible reason; please help me figure it out. -> I am using the same example program with a basic CGI modification for taking input from a web browser. -> Ubuntu 10.04 64 bit OS -> apache2 server -> I have given all permissions (777) recursively to the cgi-bin folder -- Regards, Lucky Singh Institute of Bioinformatics and Applied Biotechnology, ------------------------------------------------------ Biotech Park Electronics City Phase I Bangalore 560 100 India. Tel: 080-28528900, 080-28528901, 080-28528902 Fax: 080-28528904 From abualiga2 at gmail.com Wed Aug 24 13:26:10 2011 From: abualiga2 at gmail.com (galeb abu-ali) Date: Wed, 24 Aug 2011 13:26:10 -0400 Subject: [Bioperl-l] append schema to proxy Message-ID: Hi, I'm trying to run a program that generates a circular genome homology atlas "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think the problem is with the module that appends schemas to the proxy, and I don't know how to do that manually. I've emailed the author a couple of times and have not heard back. Pasted below is the error message. At your convenience, I'd greatly appreciate your help. thanks galeb p/s - also, is there another program that can generate concentric circular plots of BLAST scores for multiple bacterial genomes with a per nucleotide resolution? thanks [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position Preference" -title "B. pseudomallei K96243" > sgeneric.ps # title set to 'B. pseudomallei K96243' # output format is ps # modus is 'circle' # loading reference genome ... # loading proteins ... # parsing blast lane configuration (blast.cfg) ... # .. 
parsing blast lane (B. ubonensis Bu) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done # .. parsing blast lane (B. pseudomallei DM98) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done # parsing custom lane configuration (custom.cfg) ... # .. parsing custom data entry SIDD at -0.035 ... # .. .. parsing color 000010_101010 # .. .. .. color from: r:00, g:00, b:10 # .. .. .. color to: r:10, g:10, b:10 # .. .. byrange: 9 .. 10 # .. .. boxfilter 5000 ... # .. parsing data source 'gunzip -c BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz | cut -f4 |' ... # .. .. parsing data source ... 3173005 done # reading external files and build hash of sequences ... *panic: schemas() removed in v2.00, not needed anymore* at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at xml-compile.pl line 48 main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at BLASTatlas line 177 From jj.emerson at gmail.com Wed Aug 24 21:53:38 2011 From: jj.emerson at gmail.com (J.J. Emerson) Date: Wed, 24 Aug 2011 18:53:38 -0700 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe Message-ID: Hello All, I have experienced some behavior in SeqIO that doesn't seem to be what I would expect. Basically, for a certain script, if I try to pass something like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the following two conditions are met simultaneously: 1. 
STDIN is coming from a pipe; 2. SeqIO is trying to guess the format. If STDIO is coming from redirection instead of a pipe or if the format is specified manually (i.e. BioPERL doesn't have to guess), the error doesn't seem to occur. This issue has been reported previously: http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html https://redmine.open-bio.org/issues/3122 This issue is ultimately one of using seek() on a pipe, which is forbidden (see below). To be clear, there are kludgy ways around this that allow BioPERL to take input from a pipe AND guess the format. My naive and inefficient kludge was to test for reading from STDIN and for the absence of a format. If both of these conditions are met, then I slurp STDIN into a variable and then open a filehandle on that variable, and pass it to SeqIO, which can guess the format if the fh isn't opened on a pipe. SeqIO then successfully guesses the format and does the SeqIO thing, at the expense of having the program pass over the data at least twice. And if the input file is huge, it could potentially consume all the memory. A better way to address the problem would be to process the input one line at a time, but this seems to require more extensive changes. The reason I'm reposting this is because I think that the inability to guess the sequence format from data originating from a pipe is an important limitation for a fundamental part of BioPERL. When designing scripts to be used in pipelines, the inability to guess formats for piped data limits BioPERL's pipelineability substantially. Even though previous reports of this have been made and a bug opened and closed, I was wondering if anyone thought this was worthwhile fixing so as to make SeqIO (and probably AlignIO as well?) more flexible? Does anyone think this should be refiled as a bug? Cheers, J.J. PS Below are snippets of code and/or errors related to reproducing the failure to guess unspecified formats. 
I'll see how Mailman treats my attachments and post the code as a reply if they don't work. The bioperl_fhtest.pl attachment is the script that reproduces the error. The w.fa is a fasta file containing some sequence. Here are the command lines to generate the behavior I observe (w.fa is a file containing some fasta sequences, in my case it was the w gene from different *Drosophila* species): ./bioperl_fhtest.pl fasta < w.fa # Works (redirection, no guessing) > ./bioperl_fhtest.pl < w.fa # Works (redirection, guessing) > > cat w.fa | ./bioperl_fhtest.pl fasta # Works (pipe, no guessing) > cat w.fa | ./bioperl_fhtest.pl # DOESN'T work (pipe, guessing) > Here's the error I get in the last case: ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Failed resetting the filehandle; IO error occurred > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472 > STACK: Bio::Tools::GuessSeqFormat::guess > /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512 > STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381 > STACK: ./bioperl_fhtest.pl:8 > ----------------------------------------------------------- > >From what I gather, the error is triggered by a failure of seek() on a STDIO fh on lines 517-518 (text from the version GuessSeqFormat.pm installed on my server): 512 if (defined $self->{-file}) { > 513 # Close the file we opened. > 514 close($fh); > 515 } elsif (ref $fh eq 'GLOB') { > 516 # Try seeking to the start position. > 517 seek($fh, $start_pos, 0) || $self->throw("Failed resetting > the ". > 518 "filehandle; IO error > occurred");; > 519 } elsif (defined $fh && $fh->can('setpos')) { > 520 # Seek to the start position. > 521 $fh->setpos($start_pos); > 522 } > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: bioperl_fhtest.pl Type: text/x-perl-script Size: 505 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: w.fa Type: application/octet-stream Size: 6335 bytes Desc: not available URL: From frederic.sapet at gmail.com Thu Aug 25 09:24:08 2011 From: frederic.sapet at gmail.com (Frédéric Sapet) Date: Thu, 25 Aug 2011 15:24:08 +0200 Subject: [Bioperl-l] fasta35 and fasta36 parsing support in BioPerl Message-ID: Hello I have tried to parse a fasta35 report file using BioPerl, in order to produce a valid HTML file. It seems to work well, but there's a small issue with the homology string in the report. Please find a test script in the attached files. After that, I tried to parse a fasta36 file, but this does not seem to be supported yet. Here is the error thrown:

    Uncaught exception from user code:
    ------------- EXCEPTION: Bio::Root::Exception -------------
    MSG: Unrecognized alignment line (3) '>--'
    STACK: Error::throw
    STACK: Bio::Root::Root::throw /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm:472
    STACK: Bio::SearchIO::fasta::next_result /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm:1061
    STACK: ./test.pl:36
    -----------------------------------------------------------
     at /usr/lib/perl5/site_perl/5.10.0/Error.pm line 184
    Error::throw('Bio::Root::Exception', 'Unrecognized alignment line (3) \'>--\'') called at /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm line 472
    Bio::Root::Root::throw('Bio::SearchIO::fasta=HASH', 'Unrecognized alignment line (3) \'>--\'') called at /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm line 1061
    Bio::SearchIO::fasta::next_result('Bio::SearchIO::fasta=HASH') called at ./test.pl line 36

Thank you Fred -------------- next part -------------- A non-text attachment was scrubbed... 
Name: FastaBioPerl.tar.bz2 Type: application/x-bzip2 Size: 7692 bytes Desc: not available URL: From miquel.amat at me.com Tue Aug 23 02:07:54 2011 From: miquel.amat at me.com (Miguel A. Amat) Date: Tue, 23 Aug 2011 02:07:54 -0400 Subject: [Bioperl-l] Help Message-ID: <44829080-5467-4103-AF5B-D09CBDA6F99F@me.com> I am trying to install BioPerl on Mac OS X 10.7 but ran into problems with the dependencies Bio::ASN1::EntrezGene and DBD::mysql. I am running the latest version of CPAN, Perl 5.12.3, and the BioPerl-1.6.1 package. The installation was being conducted interactively via the "perl Build.PL" command. Can you provide some help? From bosborne11 at verizon.net Thu Aug 25 10:35:29 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 25 Aug 2011 10:35:29 -0400 Subject: [Bioperl-l] SeqIO alters Genbank files Message-ID: bioperl-l, I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: /score=100.1 And adding a "note" tag, so the output file contains this: /score=100.1 /note="score=100.1" I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: /score=100.1 /note="score=100.1" /note="score=100.1" /note="score=100.1" /note="score=100.1" Should I comment out the code that's doing these edits or not? Thanks again, Brian O. 
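
The accumulation described above can be illustrated with a minimal sketch that uses a plain hash as a hypothetical stand-in for a feature's tags. This is not the FTHelper.pm code, and the function names add_score_note_naive and add_score_note_once are made up for illustration. It only shows that appending the note unconditionally is not idempotent across round trips, whereas checking for an identical existing note is.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a feature's tags: tag name => list of values.
# This is NOT the FTHelper.pm code; it only mimics the reported behavior.

# Naive version: always append a note mirroring each score value.
sub add_score_note_naive {
    my ($tags) = @_;
    return unless exists $tags->{score};
    for my $score (@{ $tags->{score} }) {
        push @{ $tags->{note} }, "score=$score";
    }
    return;
}

# Guarded version: append the note only if an identical one is not present.
sub add_score_note_once {
    my ($tags) = @_;
    return unless exists $tags->{score};
    my %seen = map { $_ => 1 } @{ $tags->{note} || [] };
    for my $score (@{ $tags->{score} }) {
        my $note = "score=$score";
        push @{ $tags->{note} }, $note unless $seen{$note}++;
    }
    return;
}

# Simulate four read/write round trips on the same feature.
my %naive = (score => ['100.1']);
add_score_note_naive(\%naive) for 1 .. 4;
print scalar @{ $naive{note} }, " note(s) after 4 naive round trips\n";

my %once = (score => ['100.1']);
add_score_note_once(\%once) for 1 .. 4;
print scalar @{ $once{note} }, " note(s) after 4 guarded round trips\n";
```

Whether a guard like this, or dropping the note altogether, is the right fix depends on what other code expects from the output; the sketch only makes the round-trip behavior concrete.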
From cjfields at illinois.edu Thu Aug 25 12:21:15 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:21:15 -0500 Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast In-Reply-To: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> References: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Message-ID: It's hard to evaluate what the problem is w/o code, the BioPerl version, and so on. It's very possible you are using an out-of-date BioPerl. chris On Aug 22, 2011, at 1:01 AM, Lucky Singh wrote: > Dear sir/Ma'am, > > I am student of Institute of Bioinformatics and Applied Biotechnology, > Bangalore, India. While doing my project work I needed remoteblast.pm. So > I used default example program which is available with this package. Now I > wanted to host it from web server, but This program is not working from it > may be it is not able to create or write on file from web server but in > command line it is working fine. I don't know the possible reason, please > help me to figure it out. > > > -> I am using same example program with basic cgi modification for taking > input from web browser. > -> Ubuntu 10.04 64 bit OS > -> apache2 server > -> I have given all permissions 777 recursively to cgi-bin folder > > > -- > Regards, > Lucky Singh > > Institute of Bioinformatics and Applied Biotechnology, > ------------------------------------------------------ > Biotech Park > Electronics City Phase I > Bangalore 560 100 > India. 
> Tel: 080-28528900, 080-28528901, 080-28528902 > Fax: 080-28528904 > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 12:34:40 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:34:40 -0500 Subject: [Bioperl-l] fasta35 and fasta36 parsing support in BioPerl In-Reply-To: References: Message-ID: <4C95797A-343C-4651-AF0C-964A7E10E8D1@illinois.edu> Frederic, The best place to post this is to our bug server: http://redmine.open-bio.org Attach all relevant data for the bug; this really helps us diagnose the issue. chris On Aug 25, 2011, at 8:24 AM, Frédéric Sapet wrote: > Hello > I have tried to parse a fasta35 report file using BioPerl, in order to > produce a valid HTML file. > It seems to work well, but there's a small issue with homology string > in the report. > Please find in joined files, a test script. 
> > After that, I have tried to parse a fasta36 file, but this seems to be > not supported yet: here is the error thrown : > > Uncaught exception from user code: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Unrecognized alignment line (3) '>--' > STACK: Error::throw > STACK: Bio::Root::Root::throw > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm:472 > STACK: Bio::SearchIO::fasta::next_result > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm:1061 > STACK: ./test.pl:36 > ----------------------------------------------------------- > at /usr/lib/perl5/site_perl/5.10.0/Error.pm line 184 > Error::throw('Bio::Root::Exception', 'Unrecognized alignment line (3) > \'>--\'') called at > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm line > 472 > Bio::Root::Root::throw('Bio::SearchIO::fasta=HASH', 'Unrecognized > alignment line (3) \'>--\'') called at > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm > line 1061 > Bio::SearchIO::fasta::next_result('Bio::SearchIO::fasta=HASH') called > at ./test.pl line 36 > > Thank you > > Fred > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 12:42:30 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:42:30 -0500 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: Brian, I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. 
I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? chris On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: > bioperl-l, > > I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: > > /score=100.1 > > And adding a "note" tag, so the output file contains this: > > /score=100.1 > /note="score=100.1" > > I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. > > On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: > > /score=100.1 > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > > Should I comment out the code that's doing these edits or not? > > Thanks again, > > Brian O. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 12:58:51 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:58:51 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: Message-ID: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> On Aug 24, 2011, at 8:53 PM, J.J. Emerson wrote: > Hello All, > > I have experienced some behavior in SeqIO that doesn't seem to be what I > would expect. 
Basically, for a certain script, if I try to pass something > like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the > following two conditions are met simultaneously: > > 1. STDIN is coming from a pipe; > 2. SeqIO is trying to guess the format. > > If STDIO is coming from redirection instead of a pipe or if the format is > specified manually (i.e. BioPERL doesn't have to guess), the error doesn't > seem to occur. > > This issue has been reported previously: > > http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html > https://redmine.open-bio.org/issues/3122 Yes, this was addressed according to that case. > This issue is ultimately one of using seek() on a pipe, which is forbidden > (see below). To be clear, there are kludgy ways around this that allow > BioPERL to take input from a pipe AND guess the format. My naive and > inefficient kludge was to test for reading from STDIN and for the absence of > a format. If both of these conditions are met, then I slurp STDIN into a > variable and then open a filehandle on that variable, and pass it to SeqIO, > which can guess the format if the fh isn't opened on a pipe. SeqIO then > successfully guesses the format and does the SeqIO thing, at the expense of > having the program pass over the data at least twice. And if the input file > is huge, it could potentially consume all the memory. A better way to > address the problem would be to process the input one line at a time, but > this seems to require more extensive changes. Have you tried tempfiles? Not that this is a great solution, but it's very commonly used for large sequence data, and it is seekable. This behavior could also be wrapped in GuessSeqFormat i suppose (but see below) > The reason I'm reposting this is because I think that the inability to guess > the sequence format from data originating from a pipe is an important > limitation for a fundamental part of BioPERL. 
When designing scripts to be > used in pipelines, the inability to guess formats for piped data limits > BioPERL's pipelineability substantially. Even though previous reports of > this have been made and a bug opened and closed, I was wondering if anyone > thought this was worthwhile fixing so as to make SeqIO (and probably AlignIO > as well?) more flexible? > > Does anyone think this should be refiled as a bug? > > Cheers, > > J.J. The fundamental problem with pipes (as you indicated) is that the data stream is not seekable. We do have a built-in buffer in Bio::Root::IO that somewhat handles this, but Bio::Tools::GuessSeqFormat is (IIRC) designed to use the filehandle directly, bypassing the BioPerl IO layer completely. One solution is to redesign GuessSeqFormat to use Bio::Root::IO, have GuessSeqFormat push all data back to the buffer, then let SeqIO parse. That will require some fundamental changes for both Bio::Root::IO and Bio::SeqIO (note that one cannot pass a Bio::Root::IO instance to another Bio::Root::IO-based class for parsing at this time). The other option is (as hinted above) having GuessSeqFormat dump the data to a tempfile, seek back after guessing, and retain the filehandle for Bio::SeqIO. Not the best solutions, but either should work. My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway? If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. 
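
The tempfile option mentioned above can be sketched with core Perl only. seekable_copy() is a hypothetical helper, not a BioPerl function, and this is only one way to do it under the assumption that the whole stream fits on disk: it copies the input to an anonymous temporary file and returns a handle that can be rewound after the format has been inspected.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Copy everything from a (possibly non-seekable) input handle into an
# anonymous temporary file and return a seekable handle rewound to the start.
# seekable_copy() is a hypothetical helper, not part of BioPerl.
sub seekable_copy {
    my ($in) = @_;
    my $tmp = tempfile();    # scalar context: returns only the filehandle
    binmode $in;
    binmode $tmp;
    my $buf;
    while (read($in, $buf, 65536)) {
        print {$tmp} $buf or die "Cannot write to tempfile: $!";
    }
    seek($tmp, 0, 0) or die "Cannot rewind tempfile: $!";
    return $tmp;
}

# Demonstration with a real pipe created in-process.
pipe(my $reader, my $writer) or die "pipe failed: $!";
print {$writer} ">seq1\nACGT\n";
close $writer;

my $fh = seekable_copy($reader);
my $first = <$fh>;                          # peek, as a format guesser would
seek($fh, 0, 0) or die "seek failed: $!";   # rewinding now succeeds
my @lines = <$fh>;                          # all data is still available
print "first line: $first";
print "total lines: ", scalar @lines, "\n";
```

In a script like the one discussed in this thread, the returned handle would presumably be passed as -fh to Bio::SeqIO->new() in place of \*STDIN. The cost is one extra pass over the data and the disk space for the copy, which is the trade-off already noted for this approach.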
> PS > > Below are snippets of code and/or errors related to reproducing the failure > to guess unspecified formats. I'll see how Mailman treats my attachments and > post the code as a reply if they don't work. > > The bioperl_fhtest.pl attachment is the script that reproduces the error. > The w.fa is a fasta file containing some sequence. > > Here are the command lines to generate the behavior I observe (w.fa is a > file containing some fasta sequences, in my case it was the w gene from > different *Drosophila* species): > > ./bioperl_fhtest.pl fasta < w.fa # Works (redirection, no guessing) >> ./bioperl_fhtest.pl < w.fa # Works (redirection, guessing) >> >> cat w.fa | ./bioperl_fhtest.pl fasta # Works (pipe, no guessing) >> cat w.fa | ./bioperl_fhtest.pl # DOESN'T work (pipe, guessing) >> > > > Here's the error I get in the last case: > > ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: Failed resetting the filehandle; IO error occurred >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472 >> STACK: Bio::Tools::GuessSeqFormat::guess >> /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512 >> STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381 >> STACK: ./bioperl_fhtest.pl:8 >> ----------------------------------------------------------- >> > >> From what I gather, the error is triggered by a failure of seek() on a STDIO > fh on lines 517-518 (text from the version GuessSeqFormat.pm installed on my > server): > > 512 if (defined $self->{-file}) { >> 513 # Close the file we opened. >> 514 close($fh); >> 515 } elsif (ref $fh eq 'GLOB') { >> 516 # Try seeking to the start position. >> 517 seek($fh, $start_pos, 0) || $self->throw("Failed resetting >> the ". >> 518 "filehandle; IO error >> occurred");; >> 519 } elsif (defined $fh && $fh->can('setpos')) { >> 520 # Seek to the start position. 
>> 521 $fh->setpos($start_pos); >> 522 } >> > _______________________________________________ You are always welcome to reopen and update the bug, or file a new one. chris From cjfields at illinois.edu Thu Aug 25 13:16:03 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 12:16:03 -0500 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: <393F144A-AECE-4F7D-B418-B71D46F3C82F@illinois.edu> Brian, Yes, that's correct (comment out or remove the other stuff). Not sure what difference it will make, I'm interested to see if anything fundamental expects this behavior and breaks with tests. Using 'git blame', it appears Allen Day added this in relation to Feature-Annotation code we actually reverted a few years ago, so this should be removed anyway. I still think we should work around FTHelper altogether. Reading the code, it seems like a ton of wasted instances being generated for no apparent reason. Now going back to our bioperl archives to see if there is any need for it... chris On Aug 25, 2011, at 11:53 AM, Brian Osborne wrote: > Chris, > > OK, will do. I should add that an early version of FTHelper was doing this same edit with the "strand", "source_tag", and "frame" tags but someone has commented out the "source_tag" and "strand" lines. > > Should I comment out both "score" and "frame" code? > > BIO > > On Aug 25, 2011, at 12:42 PM, Chris Fields wrote: > >> Brian, >> >> I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). >> >> To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. 
Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? >> >> chris >> >> On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: >> >>> bioperl-l, >>> >>> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >>> >>> /score=100.1 >>> >>> And adding a "note" tag, so the output file contains this: >>> >>> /score=100.1 >>> /note="score=100.1" >>> >>> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >>> >>> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >>> >>> /score=100.1 >>> /note="score=100.1" >>> /note="score=100.1" >>> /note="score=100.1" >>> /note="score=100.1" >>> >>> Should I comment out the code that's doing these edits or not? >>> >>> Thanks again, >>> >>> Brian O. >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bosborne11 at verizon.net Thu Aug 25 12:53:08 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 25 Aug 2011 12:53:08 -0400 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: Chris, OK, will do. 
I should add that an early version of FTHelper was doing this same edit with the "strand", "source_tag", and "frame" tags but someone has commented out the "source_tag" and "strand" lines. Should I comment out both "score" and "frame" code? BIO On Aug 25, 2011, at 12:42 PM, Chris Fields wrote: > Brian, > > I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). > > To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? > > chris > > On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: > >> bioperl-l, >> >> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >> >> /score=100.1 >> >> And adding a "note" tag, so the output file contains this: >> >> /score=100.1 >> /note="score=100.1" >> >> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >> >> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >> >> /score=100.1 >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> >> Should I comment out the code that's doing these edits or not? 
>> >> Thanks again, >> >> Brian O. >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jj.emerson at gmail.com Thu Aug 25 14:52:48 2011 From: jj.emerson at gmail.com (J.J. Emerson) Date: Thu, 25 Aug 2011 11:52:48 -0700 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: Hi Chris, You asked: My question (not a criticism, just trying to understand the problem): why > are you going through all the trouble of using GuessSeqFormat as a permanent > solution anyway? If you have a stream returning a possibly unknown data > type, I would argue that the fundamental bug is not GuessSeqFormat but > something else, more specifically not knowing the behavior of the data > source and the returned format to begin with. Is something preventing that? > In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a > permanent solution to your problems (it is guessing, after all). Note the > code has had very little development over the years, and the related SeqIO > code hasn't aged particularly well. > I see. 
I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? Cheers, J.J. PS * The way I plan on using my script is roughly as follows: prog1 [some arguments] \ | myscript.pl --informat fasta \ | prog2 \ | prog3 > pipeline.output However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. In principle, being able to do the following would be very useful: prog1 [some arguments] \ | myscript.pl \ | prog2 > pipeline.output The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. On Thu, Aug 25, 2011 at 9:58 AM, Chris Fields wrote: > On Aug 24, 2011, at 8:53 PM, J.J. Emerson wrote: > > > Hello All, > > > > I have experienced some behavior in SeqIO that doesn't seem to be what I > > would expect. Basically, for a certain script, if I try to pass something > > like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the > > following two conditions are met simultaneously: > > > > 1. STDIN is coming from a pipe; > > 2. SeqIO is trying to guess the format. > > > > If STDIO is coming from redirection instead of a pipe or if the format is > > specified manually (i.e. BioPERL doesn't have to guess), the error > doesn't > > seem to occur. > > > > This issue has been reported previously: > > > > http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html > > https://redmine.open-bio.org/issues/3122 > > Yes, this was addressed according to that case. 
> > > This issue is ultimately one of using seek() on a pipe, which is > forbidden > > (see below). To be clear, there are kludgy ways around this that allow > > BioPERL to take input from a pipe AND guess the format. My naive and > > inefficient kludge was to test for reading from STDIN and for the absence > of > > a format. If both of these conditions are met, then I slurp STDIN into a > > variable and then open a filehandle on that variable, and pass it to > SeqIO, > > which can guess the format if the fh isn't opened on a pipe. SeqIO then > > successfully guesses the format and does the SeqIO thing, at the expense > of > > having the program pass over the data at least twice. And if the input > file > > is huge, it could potentially consume all the memory. A better way to > > address the problem would be to process the input one line at a time, but > > this seems to require more extensive changes. > > Have you tried tempfiles? Not that this is a great solution, but it's very > commonly used for large sequence data, and it is seekable. This behavior > could also be wrapped in GuessSeqFormat i suppose (but see below) > > > The reason I'm reposting this is because I think that the inability to > guess > > the sequence format from data originating from a pipe is an important > > limitation for a fundamental part of BioPERL. When designing scripts to > be > > used in pipelines, the inability to guess formats for piped data limits > > BioPERL's pipelineability substantially. Even though previous reports of > > this have been made and a bug opened and closed, I was wondering if > anyone > > thought this was worthwhile fixing so as to make SeqIO (and probably > AlignIO > > as well?) more flexible? > > > > Does anyone think this should be refiled as a bug? > > > > Cheers, > > > > J.J. > > The fundamental problem with pipes (as you indicated) is that the data > stream is not seekable. 
We do have a built-in buffer in Bio::Root::IO that > somewhat handles this, but Bio::Tools::GuessSeqFormat is (IIRC) designed to > use the filehandle directly, bypassing the BioPerl IO layer completely. > > One solution is to redesign GuessSeqFormat to use Bio::Root::IO, have > GuessSeqFormat push all data back to the buffer, then let SeqIO parse. That > will require some fundamental changes for both Bio::Root::IO and Bio::SeqIO > (note that one cannot pass a Bio::Root::IO instance to another > Bio::Root::IO-based class for parsing at this time). > > The other option is (as hinted above) having GuessSeqFormat dump the data > to a tempfile, seek back after guessing, and retain the filehandle for > Bio::SeqIO. Not the best solutions, but either should work. > > My question (not a criticism, just trying to understand the problem): why > are you going through all the trouble of using GuessSeqFormat as a permanent > solution anyway? If you have a stream returning a possibly unknown data > type, I would argue that the fundamental bug is not GuessSeqFormat but > something else, more specifically not knowing the behavior of the data > source and the returned format to begin with. Is something preventing that? > > My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not > a permanent solution to your problems (it is guessing, after all). Note the > code has had very little development over the years, and the related SeqIO > code hasn't aged particularly well. > > > PS > > > > Below are snippets of code and/or errors related to reproducing the > failure > > to guess unspecified formats. I'll see how Mailman treats my attachments > and > > post the code as a reply if they don't work. > > > > The bioperl_fhtest.pl attachment is the script that reproduces the > error. > > The w.fa is a fasta file containing some sequence. 
> > > > Here are the command lines to generate the behavior I observe (w.fa is a > > file containing some fasta sequences, in my case it was the w gene from > > different *Drosophila* species): > > > > ./bioperl_fhtest.pl fasta < w.fa # Works (redirection, no guessing) > >> ./bioperl_fhtest.pl < w.fa # Works (redirection, guessing) > >> > >> cat w.fa | ./bioperl_fhtest.pl fasta # Works (pipe, no guessing) > >> cat w.fa | ./bioperl_fhtest.pl # DOESN'T work (pipe, guessing) > >> > > > > > > Here's the error I get in the last case: > > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > >> MSG: Failed resetting the filehandle; IO error occurred > >> STACK: Error::throw > >> STACK: Bio::Root::Root::throw > >> /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472 > >> STACK: Bio::Tools::GuessSeqFormat::guess > >> /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512 > >> STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381 > >> STACK: ./bioperl_fhtest.pl:8 > >> ----------------------------------------------------------- > >> > > > >> From what I gather, the error is triggered by a failure of seek() on a > STDIO > > fh on lines 517-518 (text from the version GuessSeqFormat.pm installed on > my > > server): > > > > 512 if (defined $self->{-file}) { > >> 513 # Close the file we opened. > >> 514 close($fh); > >> 515 } elsif (ref $fh eq 'GLOB') { > >> 516 # Try seeking to the start position. > >> 517 seek($fh, $start_pos, 0) || $self->throw("Failed > resetting > >> the ". > >> 518 "filehandle; IO error > >> occurred");; > >> 519 } elsif (defined $fh && $fh->can('setpos')) { > >> 520 # Seek to the start position. > >> 521 $fh->setpos($start_pos); > >> 522 } > >> > > _______________________________________________ > > You are always welcome to reopen and update the bug, or file a new one. 
> > chris > > From cjfields at illinois.edu Thu Aug 25 17:04:15 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 16:04:15 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: On Aug 25, 2011, at 1:52 PM, J.J. Emerson wrote: > Hi Chris, > > You asked: > > My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway? If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? > > In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. > > My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. > > I see. I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? That's fine. I don't want to dissuade you from taking this on, either. > Cheers, > > J.J. 
> > PS > > * The way I plan on using my script is roughly as follows: > > prog1 [some arguments] \ > | myscript.pl --informat fasta \ > | prog2 \ > | prog3 > pipeline.output > > However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. In principle, being able to do the following would be very useful: > > prog1 [some arguments] \ > | myscript.pl \ > | prog2 > pipeline.output > > The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. Not disagreeing with you at all, flexible code is best. chris From hlapp at drycafe.net Thu Aug 25 22:29:44 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 26 Aug 2011 11:29:44 +0900 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> Could this behavior perhaps be made optional, with the default being off? -hilmar On Aug 25, 2011, at 11:35 PM, Brian Osborne wrote: > bioperl-l, > > I need to run something by you before I commit code and tests. I > have code that takes a Genbank file as input and creates another > Genbank file as output. I noticed that SeqIO - specifically > FTHelper.pm - was taking a tag like this in the input file: > > /score=100.1 > > And adding a "note" tag, so the output file contains this: > > /score=100.1 > /note="score=100.1" > > I'm assuming that the code does this because NCBI will not accept > score tags and values even though Bioperl, generally speaking, does > not say that NCBI defines the fine details of Genbank format. > > On the other hand I don't like the idea that SeqIO is altering the > content. 
It also turns out that if you have code that does multiple > round-trips you end up with text like this: > > /score=100.1 > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > > Should I comment out the code that's doing these edits or not? > > Thanks again, > > Brian O. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From carandraug+dev at gmail.com Fri Aug 26 10:20:39 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Fri, 26 Aug 2011 15:20:39 +0100 Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast In-Reply-To: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> References: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Message-ID: On 22 August 2011 07:01, Lucky Singh wrote: > Now I > wanted to host it from web server, but This program is not working from it > may be it is not able to create or write on file from web server but in > command line it is working fine. I don't know the possible reason, please > help me to figure it out. Have you looked in the apache logs (look in /var/log/apache2/error.log) ? Can you pastebin your whole code and the content of the error log after trying to run the script? From bosborne11 at verizon.net Fri Aug 26 10:39:44 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 26 Aug 2011 10:39:44 -0400 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> References: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> Message-ID: <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> Hilmar, Yes, of course. 
Are you thinking that this code is designed, in part, to help people submit to NCBI? BIO On Aug 25, 2011, at 10:29 PM, Hilmar Lapp wrote: > Could this behavior perhaps be made optional, with the default being off? > > -hilmar > > On Aug 25, 2011, at 11:35 PM, Brian Osborne wrote: > >> bioperl-l, >> >> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >> >> /score=100.1 >> >> And adding a "note" tag, so the output file contains this: >> >> /score=100.1 >> /note="score=100.1" >> >> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >> >> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >> >> /score=100.1 >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> >> Should I comment out the code that's doing these edits or not? >> >> Thanks again, >> >> Brian O. 
>> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > From hlapp at drycafe.net Fri Aug 26 10:50:26 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 26 Aug 2011 23:50:26 +0900 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> References: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> Message-ID: On Aug 26, 2011, at 11:39 PM, Brian Osborne wrote: > Are you thinking that this code is designed, in part, to help people > submit to NCBI? I don't know, but perhaps. My thinking was, if the code is doing something that's useful in some, but bad in many or most other situations, it'd be nice if the useful behavior could be retained as an option for those who expressly want (or need) it. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From florent.angly at gmail.com Sat Aug 27 07:12:05 2011 From: florent.angly at gmail.com (Florent Angly) Date: Sat, 27 Aug 2011 21:12:05 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: <4E58D105.7050805@gmail.com> On the topic of guessing file formats, last I checked, it was difficult to reuse the format guessed by Bio::SeqIO For example, if I want to takes sequences in any format (FASTA, FASTQ, ...) 
and filter some of them out and put them in a new file in the same format, I need to do something along these lines:

    # Open the file and let BioPerl guess its format
    my $in = Bio::SeqIO->new( -file => $input_seqfile );

    # Have BioPerl guess the format (again) so we can use the same format for the output file
    my $format = $in->_guess_format( $input_seqfile );

    # Open the output file (same format as the input file)
    my $out = Bio::SeqIO->new( -file => ">".$output_seqfile, -format => $format );

    # Now do the work...

The limitation of the code above is that it is more complex than it should be and forces BioPerl to check the file format twice. My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store or access its value. The idea would be that the example code above could be rewritten as:

    # Open the file and let BioPerl guess its format
    my $in = Bio::SeqIO->new( -file => $input_seqfile );

    # Retrieve the format guessed by BioPerl
    my $format = $in->format( );

    # Open the output file using the same format as the input file
    my $out = Bio::SeqIO->new( -file => ">".$output_seqfile, -format => $format );

    # Now do the work...

I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, which guesses the alphabet but has a get/set method to access it.

Florent

On 26/08/11 07:04, Chris Fields wrote:
> On Aug 25, 2011, at 1:52 PM, J.J. Emerson wrote:
>
>> Hi Chris,
>>
>> You asked:
>>
>> My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway?
If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? >> >> In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. >> >> My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. >> >> I see. I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? > That's fine. I don't want to dissuade you from taking this on, either. > >> Cheers, >> >> J.J. >> >> PS >> >> * The way I plan on using my script is roughly as follows: >> >> prog1 [some arguments] \ >> | myscript.pl --informat fasta \ >> | prog2 \ >> | prog3> pipeline.output >> >> However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. 
In principle, being able to do the following would be very useful: >> >> prog1 [some arguments] \ >> | myscript.pl \ >> | prog2> pipeline.output >> >> The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. > Not disagreeing with you at all, flexible code is best. > > chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 26 23:54:05 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 26 Aug 2011 22:54:05 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E58D105.7050805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: On Aug 27, 2011, at 6:12 AM, Florent Angly wrote: > On the topic of guessing file formats, last I checked, it was difficult to reuse the format guessed by Bio::SeqIO > > For example, if I want to takes sequences in any format (FASTA, FASTQ, ...) and filter some of them out and put them in a new file in the same format, I need to do something along these lines: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Have Bioperl guess the format (again) so we can use the same format for the output file > my $format = $in->_guess_format( $input_seqfile ); > > # Open the output file (same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > The limitations of the code above is that in is more complex than it should be and forces Bioperl do check the file format twice. 
My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The name of the class is the format (that's how they are loaded). We could add this as a convenience level for Bio::SeqIO (fairly easy to do, actually), but it would only makes sense as a getter. Bio::SeqIO dynamically loads the proper Bio::SeqIO:: module in the constructor (Bio::SeqIO::genbank, for example). Being able to set the format to 'fasta' with a loaded Bio::SeqIO::genbank still gets GenBank format. > The idea would be that the example code above could be rewritten as: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Retrieve the format guessed by BioPerl > my $format = $in->format( ); > > # Open the output file using the same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. > > Florent Guessing the alphabet for the vast majority of sequence data isn't quite as complex and quixotic as guessing a sequence format. The latter is far more variable and infinitely increases, much like standards (ex: http://xkcd.com/927/). Not that sequences aren't capable of change... 
chris From hlapp at drycafe.net Fri Aug 26 23:43:57 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Sat, 27 Aug 2011 12:43:57 +0900 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E58D105.7050805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: The format is already available - it is in essence the class of the SeqIO instance: my $format = ref($in); Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: my $out = ref($in)->new(-file => ...); Would that address what you are trying to accomplish? -hilmar Sent with a tap. On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: > My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The idea would be that the example code above could be rewritten as: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Retrieve the format guessed by BioPerl > my $format = $in->format( ); > > # Open the output file using the same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. 
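
The idea in the message above (the format is the class of the IO object, and a second object of the same format can be built from that class) can be sketched without BioPerl installed. This is a minimal illustration under assumptions: `My::SeqIO::fasta` is a made-up stand-in for a dynamically loaded driver class such as Bio::SeqIO::fasta, and `format_of` is a hypothetical helper, not an existing BioPerl method.

```perl
use strict;
use warnings;

# Made-up stand-in for a driver class; a real script would obtain its
# object from Bio::SeqIO->new instead.
{
    package My::SeqIO::fasta;
    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }
}

package main;

# Derive a short format name from an IO object: the last component of
# its package name, lower-cased.
sub format_of {
    my ($io) = @_;
    my $class = ref($io) or die "format_of() needs an object";
    my ($format) = $class =~ /([^:]+)\z/;
    return lc $format;
}

my $in  = My::SeqIO::fasta->new( -file => 'in.fa' );
my $out = ref($in)->new( -file => '>out.fa' );   # same class, hence same format

print format_of($in), "\n";
print ref($out), "\n";
```

A getter built this way is read-only by nature: the class is fixed once the object exists, which matches the point made in the thread that a setter would not make sense.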
From florent.angly at gmail.com Sun Aug 28 05:08:32 2011 From: florent.angly at gmail.com (Florent Angly) Date: Sun, 28 Aug 2011 19:08:32 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: <4E5A0590.2010805@gmail.com> Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. Florent On 27/08/11 13:43, Hilmar Lapp wrote: > The format is already available - it is in essence the class of the SeqIO instance: > > my $format = ref($in); > > Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: > > my $out = ref($in)->new(-file => ...); > > Would that address what you are trying to accomplish? > > -hilmar > > Sent with a tap. > > On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: > >> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The idea would be that the example code above could be rewritten as: >> >> # Open the file and let BioPerl guess its format >> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >> >> # Retrieve the format guessed by BioPerl >> my $format = $in->format( ); >> >> # Open the output file using the same format as the input file >> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >> >> # Now do the work... >> >> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. 
From cjfields at illinois.edu Sat Aug 27 23:27:34 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sat, 27 Aug 2011 22:27:34 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E5A0590.2010805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> Message-ID: <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> There is no reason the variant couldn't also be a method; it's fairly generic to Bio::SeqIO. FASTQ just happens to be the only parser that takes advantage of it (probably b/c I added it when I refactored FASTQ :) See the code for Bio::SeqIO::new to see what is done. Again, like the format it only makes sense as a getter method. chris On Aug 28, 2011, at 4:08 AM, Florent Angly wrote: > > Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. > Florent > > > On 27/08/11 13:43, Hilmar Lapp wrote: >> The format is already available - it is in essence the class of the SeqIO instance: >> >> my $format = ref($in); >> >> Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: >> >> my $out = ref($in)->new(-file => ...); >> >> Would that address what you are trying to accomplish? >> >> -hilmar >> >> Sent with a tap. >> >> On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: >> >>> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. 
The idea would be that the example code above could be rewritten as: >>> >>> # Open the file and let BioPerl guess its format >>> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >>> >>> # Retrieve the format guessed by BioPerl >>> my $format = $in->format( ); >>> >>> # Open the output file using the same format as the input file >>> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >>> >>> # Now do the work... >>> >>> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florent.angly at gmail.com Sun Aug 28 18:35:36 2011 From: florent.angly at gmail.com (Florent Angly) Date: Mon, 29 Aug 2011 08:35:36 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> Message-ID: <4E5AC2B8.9060808@gmail.com> Hi, I implemented the format() getter method in Bio::SeqIO as discussed, essentially following the way proposed by Hilmar. The variant() method is not needed since Bio::SeqIO::fastq already has a get/set method for that. I noticed that there are plenty more Bio*IO modules that could benefit from having a format() method, e.g.: Bio::AlignIO Bio::ClusterIO Bio::FeatureIO Bio::MapIO Bio::OntologyIO Bio::SearchIO Bio::TreeIO Bio::Assembly::IO * The code could be copy-pasted for each of them but it is not very graceful. Is there a way we could have all these IO modules share the same format() method? 
* Note how the IO class for Bio::Assembly is called Bio::Assembly::IO, and not Bio::AssemblyIO like for other classes. This may be something to change in the future for consistency. Florent On 28/08/11 13:27, Chris Fields wrote: > There is no reason the variant couldn't also be a method; it's fairly generic to Bio::SeqIO. FASTQ just happens to be the only parser that takes advantage of it (probably b/c I added it when I refactored FASTQ :) > > See the code for Bio::SeqIO::new to see what is done. Again, like the format it only makes sense as a getter method. > > chris > > On Aug 28, 2011, at 4:08 AM, Florent Angly wrote: > >> Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. >> Florent >> >> >> On 27/08/11 13:43, Hilmar Lapp wrote: >>> The format is already available - it is in essence the class of the SeqIO instance: >>> >>> my $format = ref($in); >>> >>> Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: >>> >>> my $out = ref($in)->new(-file => ...); >>> >>> Would that address what you are trying to accomplish? >>> >>> -hilmar >>> >>> Sent with a tap. >>> >>> On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: >>> >>>> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. 
The idea would be that the example code above could be rewritten as: >>>> >>>> # Open the file and let BioPerl guess its format >>>> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >>>> >>>> # Retrieve the format guessed by BioPerl >>>> my $format = $in->format( ); >>>> >>>> # Open the output file using the same format as the input file >>>> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >>>> >>>> # Now do the work... >>>> >>>> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Sun Aug 28 21:10:27 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 28 Aug 2011 20:10:27 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E5AC2B8.9060808@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> <4E5AC2B8.9060808@gmail.com> Message-ID: On Aug 28, 2011, at 5:35 PM, Florent Angly wrote: > Hi, > > I implemented the format() getter method in Bio::SeqIO as discussed, essentially following the way proposed by Hilmar. The variant() method is not needed since Bio::SeqIO::fastq already has a get/set method for that. Right, but the method could be used by other modules if it were moved to Bio::SeqIO. for instance. 
> I noticed that there are plenty more Bio*IO modules that could benefit from having a format() method, e.g.: > Bio::AlignIO > Bio::ClusterIO > Bio::FeatureIO > Bio::MapIO > Bio::OntologyIO > Bio::SearchIO > Bio::TreeIO > Bio::Assembly::IO * > The code could be copy-pasted for each of them but it is not very graceful. Is there a way we could have all these IO modules share the same format() method? Move the method to Bio::Root::IO, the common base class for all of the above. > * Note how the IO class for Bio::Assembly is called Bio::Assembly::IO, and not Bio::AssemblyIO like for other classes. This may be something to change in the future for consistency. > > Florent That's possible; one could take advantage of that for redesign/API issues if it were needed. chris From noncoding at gmail.com Mon Aug 29 06:31:10 2011 From: noncoding at gmail.com (Remo Sanges) Date: Mon, 29 Aug 2011 12:31:10 +0200 Subject: [Bioperl-l] Opportunity: PhD in BIOINFORMATICS at SZN, Naples, Italy In-Reply-To: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net> References: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net> Message-ID: <4E5B6A6E.2020508@gmail.com> (Apologies if you have received this already or if this is considered spam. Please feel free to pass on to anyone who might be interested.) The Stazione Zoologica Anton Dohrn in Naples is among the top research institutions in the world in the fields of marine biology and ecology. The new established bioinformatics laboratory is seeking for a candidate interested in the evolution of genome architecture http://bit.ly/okEGvL We are looking for someone who understands basic biological and evolutionary problems and is able to independently accomplish bioinformatics tasks. Candidates will be expected to have knowledge of biology, genetics and functional genomics, to demonstrate the ability to work in a UNIX/Linux environment and to be familiar with a scripting language (e.g. Perl), a database system (e.g. 
MySQL) and a statistical programming environment (e.g. R). Previous experience with comparative genomics and genomics databases as well as an understanding of statistical methods used in the interpretation of biological data is a desirable asset. Wet lab work might be required during the PhD.

All the information about the PhD and the guidelines on how to apply are listed on the webpage http://bit.ly/d2WuXk

The closing date for applications is 20 September 2011.

Kind Regards

Remo

--
Remo Sanges
Bioinformatics - Animal Physiology and Evolution
Stazione Zoologica Anton Dohrn
Villa Comunale, 80121 Napoli - Italy
+39 081 5833428

From locarpau at upvnet.upv.es Mon Aug 29 12:47:13 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 18:47:13 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu>
Message-ID: <1314636433.4e5bc291a40c6@webmail.upv.es>

Hi all,

I'm running codeml from the PAML package using the corresponding Bioperl wrapper.
I'd like to save the output file as -outfile => 'mlc', as in:

    my $codeml_factory = Bio::Tools::Run::Phylo::PAML::Codeml->new(
        -outfile        => 'mlc',
        -save_tempfiles => 1,
        -alignment      => $codon_MSA,
        -tree           => $biotree,
        -params         => {
            #'outfile'     => 'mlc',
            'verbose'      => 1,
            'noisy'        => 9,
            'runmode'      => 0,    # user tree
            'seqtype'      => 1,
            'model'        => $model,
            'NSsites'      => $NSsites,
            'fix_omega'    => $fix_omega,
            'omega'        => $omega,
            'ncatG'        => $ncatG,
            'icode'        => 0,    # 0:universal code; 1:mammalian mt; 2-10:see below (5:ciliate nuclear)
            #'fix_alpha'   => 0,
            #'fix_kappa'   => 0,
            #'RateAncestor' => 0,
            'CodonFreq'    => 2,
            'cleandata'    => 1,    # remove sites with ambiguity data (1 yes, 0 no)
            'ndata'        => 1,
        },
    );

and subsequently parsing it using

    my $parserF = Bio::Tools::Phylo::PAML->new( -file => "mlc", -dir => "./" );

However, I get the following message.

    ------------- EXCEPTION -------------
    MSG: Could not open mlc: No such file or directory
    STACK Bio::Root::IO::_initialize_io /Library/Perl//5.10.0/Bio/Root/IO.pm:351
    STACK Bio::Tools::Phylo::PAML::new /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239
    STACK main::BranchSiteEvolAnalysis /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421
    STACK toplevel /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939
    -------------------------------------

which I guess means the output file is not being saved in the previous step. Does anyone know what's wrong? Thank you very much in advance for your help.
Cheers, Lorenzo From David.Messina at sbc.su.se Mon Aug 29 13:43:33 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 29 Aug 2011 19:43:33 +0200 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: <1314636433.4e5bc291a40c6@webmail.upv.es> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> Message-ID: Hi Lorenzo, and subsequently parsing it using > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./"); > > However, I get the following message. > > ------------- EXCEPTION ------------- > MSG: Could not open mlc: No such file or directory > > what I guess means the output file is not being saved in the previous step. > Your interpretation could be correct. I think though that it might be that the -dir parameter you specify, "./", is not correct. Are you seeing the mlc file in the '.' (current working) dir? If I remember correctly, by default the mlc file is created in a temporary directory in /scratch or /tmp, and the save_tempfiles flag simply keeps that temporary directory from being deleted. I don't have the docs in front of me, but I believe there's a way to get the path of the temp directory that B::T::P::PAML is using. If so, you can use that path as the value for the -dir parameter. Let me know if not, though, and we can follow up on this. Dave PS - also, could you verify that you're using the latest versions of bioperl-live and bioperl-run from Github? 
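
The suggestion above, to look for the output file in the wrapper's temporary directory rather than in the current directory, can be checked with a small helper before the parser is constructed. The following is a minimal sketch in plain Perl: `locate_output` is a hypothetical helper written for this illustration, and a throwaway directory stands in for the wrapper's temp directory.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Return the full path to $name inside $dir, or die with a message that
# lists what the directory actually contains.
sub locate_output {
    my ($dir, $name) = @_;
    my $path = File::Spec->catfile($dir, $name);
    return $path if -e $path;
    opendir my $dh, $dir or die "Cannot read directory $dir: $!";
    my @found = grep { !/^\.\.?\z/ } readdir $dh;
    closedir $dh;
    die "No '$name' in $dir; found: @found\n";
}

# Demonstration: a throwaway directory stands in for the wrapper's temp dir.
my $tmpdir = tempdir( CLEANUP => 1 );
my $demo   = File::Spec->catfile($tmpdir, 'mlc');
open my $fh, '>', $demo or die "Cannot create $demo: $!";
print {$fh} "placeholder\n";
close $fh;

my $mlc = locate_output($tmpdir, 'mlc');
print "found $mlc\n";
```

With the real wrapper, the directory would be whatever path the factory reports for its temporary files (assuming it exposes one), and the returned full path could be passed to the parser as -file. If the file is missing, the error message shows which files the run actually left behind.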
From Kevin.M.Brown at asu.edu Mon Aug 29 14:09:29 2011 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 29 Aug 2011 11:09:29 -0700 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu><1314636433.4e5bc291a40c6@webmail.upv.es> Message-ID: <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu> Opening a file for output that does not exist requires the > or >> redirector (depending on if you want to overwrite or append output). my $parserF= Bio::Tools::Phylo::PAML->new (-file => ">mlc", -dir => "./"); Kevin Brown Center for Innovations in Medicine Biodesign Institute Arizona State University > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Dave Messina > Sent: Monday, August 29, 2011 10:44 AM > To: Lorenzo Carretero Paulet > Cc: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Saving Codeml Output file > > Hi Lorenzo, > > > and subsequently parsing it using > > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > "./"); > > > > However, I get the following message. > > > > ------------- EXCEPTION ------------- > > MSG: Could not open mlc: No such file or directory > > > > > > what I guess means the output file is not being saved in the previous > step. > > > > > Your interpretation could be correct. I think though that it might be > that > the -dir parameter you specify, "./", is not correct. Are you seeing > the mlc > file in the '.' (current working) dir? > > If I remember correctly, by default the mlc file is created in a > temporary > directory in /scratch or /tmp, and the save_tempfiles flag simply keeps > that > temporary directory from being deleted. 
> > I don't have the docs in front of me, but I believe there's a way to > get the > path of the temp directory that B::T::P::PAML is using. If so, you can > use > that path as the value for the -dir parameter. > > Let me know if not, though, and we can follow up on this. > > Dave > > PS - also, could you verify that you're using the latest versions of > bioperl-live and bioperl-run from Github? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Mon Aug 29 14:34:41 2011 From: scott at scottcain.net (Scott Cain) Date: Mon, 29 Aug 2011 14:34:41 -0400 Subject: [Bioperl-l] pls help.. In-Reply-To: References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Message-ID: Hi Ravi, Sorry I took a while to get back to you; I was on vacation last week. Also, please keep correspondence on the bioperl mailing list. If you had, perhaps somebody else would have provided another answer by now. I found the bug in the genbank2gff3 script that causes this problem. You have a few options for how to proceed: 1. Split the multi-genbank file into individual files, put them in a directory, and point the script at that directory (with the --dir flag). If you do this, you won't have to do anything with your BioPerl installation. 2. Get a fresh checkout of bioperl-live from git and install BioPerl from it, as I just committed the fix to the master branch. 3. Manually apply the fix that I just put into master. The diff is here: https://github.com/bioperl/bioperl-live/commit/1cff7d541e704a1f35d85bb27a0ab5911d89f8df Scott On Tue, Aug 23, 2011 at 12:55 AM, Ravi Devani wrote: > Yes the script works but have you seen the gff file generated by it. It has > multiple entries for the same features. And the file keeps on growing in > size with thw same features repeated many times. Thats the problem.. 
>
> Thanking you,
> Ravi
>
>
>
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                         scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)        216-392-3087
Ontario Institute for Cancer Research

From locarpau at upvnet.upv.es Mon Aug 29 14:56:50 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 20:56:50 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: 
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es>
Message-ID: <1314644210.4e5be0f277c05@webmail.upv.es>

Thanks Dave,
Yes. I did not find the output file in the current directory, or in the temp directory. Using

my $tmpdir = $codeml_factory->tempdir();
my $parserF= Bio::Tools::Phylo::PAML->new ( -file => "mlc", -dir => "$tmpdir" );

I still get the same error message. I'm using Bioperl version 1.006901.
Cheers,
Lorenzo

Quoting Dave Messina:

> Hi Lorenzo,
>
> > and subsequently parsing it using
> > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./");
> >
> > However, I get the following message.
> >
> > ------------- EXCEPTION -------------
> > MSG: Could not open mlc: No such file or directory
> >
> >
> > what I guess means the output file is not being saved in the previous step.
> >
>
>
> Your interpretation could be correct. I think though that it might be that
> the -dir parameter you specify, "./", is not correct. Are you seeing the mlc
> file in the '.' (current working) dir?
>
> If I remember correctly, by default the mlc file is created in a temporary
> directory in /scratch or /tmp, and the save_tempfiles flag simply keeps that
> temporary directory from being deleted.
>
> I don't have the docs in front of me, but I believe there's a way to get the
> path of the temp directory that B::T::P::PAML is using. If so, you can use
> that path as the value for the -dir parameter.
>
> Let me know if not, though, and we can follow up on this.
>
> Dave
>
> PS - also, could you verify that you're using the latest versions of
> bioperl-live and bioperl-run from Github?
>

From locarpau at upvnet.upv.es Mon Aug 29 15:05:49 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 21:05:49 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu><1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu>
Message-ID: <1314644749.4e5be30d78cb7@webmail.upv.es>

Kevin,
Still the same. The previous message is preceded by:

Filehandle GEN11 opened only for output at /Library/Perl//5.10.0/Bio/Root/IO.pm line 571

which points to

    # if the buffer been filled by _pushback then return the buffer
    # contents, rather than read from the filehandle
    if( @{$self->{'_readbuffer'} || [] } ) {
        $line = shift @{$self->{'_readbuffer'}};
    } else {
        $line = <$fh>;
    }

from the inner subroutine _readline of /Bio/Root/IO.pm

Best,
L

Quoting Kevin Brown:

> Opening a file for output that does not exist requires the > or >>
> redirector (depending on if you want to overwrite or append output).
> > my $parserF= Bio::Tools::Phylo::PAML->new (-file => ">mlc", -dir => > "./"); > > > > Kevin Brown > Center for Innovations in Medicine > Biodesign Institute > Arizona State University > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Dave Messina > > Sent: Monday, August 29, 2011 10:44 AM > > To: Lorenzo Carretero Paulet > > Cc: bioperl-l at lists.open-bio.org > > Subject: Re: [Bioperl-l] Saving Codeml Output file > > > > Hi Lorenzo, > > > > > > and subsequently parsing it using > > > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > > "./"); > > > > > > However, I get the following message. > > > > > > ------------- EXCEPTION ------------- > > > MSG: Could not open mlc: No such file or directory > > > > > > > > > > what I guess means the output file is not being saved in the > previous > > step. > > > > > > > > > Your interpretation could be correct. I think though that it might be > > that > > the -dir parameter you specify, "./", is not correct. Are you seeing > > the mlc > > file in the '.' (current working) dir? > > > > If I remember correctly, by default the mlc file is created in a > > temporary > > directory in /scratch or /tmp, and the save_tempfiles flag simply > keeps > > that > > temporary directory from being deleted. > > > > I don't have the docs in front of me, but I believe there's a way to > > get the > > path of the temp directory that B::T::P::PAML is using. If so, you can > > use > > that path as the value for the -dir parameter. > > > > Let me know if not, though, and we can follow up on this. > > > > Dave > > > > PS - also, could you verify that you're using the latest versions of > > bioperl-live and bioperl-run from Github? 
> > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From Kevin.M.Brown at asu.edu Mon Aug 29 15:19:53 2011 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 29 Aug 2011 12:19:53 -0700 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: <1314636433.4e5bc291a40c6@webmail.upv.es> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> Message-ID: <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> OK, went back to the original message. And here's where the problem actually originates... my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml ( # this should cause it to create a file called mlc -outfile => '>mlc', -save_tempfiles => 1, -alignment => $codon_MSA, -tree => $biotree, -params => { 'verbose' => 1, 'noisy' => 9, 'runmode' => 0, #user tree 'seqtype' => 1, 'model' => $model, 'NSsites' => $NSsites, 'fix_omega' => $fix_omega, 'omega' => $omega, 'ncatG' => $ncatG, 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below (5:ciliate nuclear) #'fix_alpha' => 0, #'fix_kappa' => 0, #'RateAncestor' => 0, 'CodonFreq' => 2, 'cleandata' => 1, # remove sites with amibguity data (1 yes, 0 no), 'ndata' => 1 }, ); Kevin Brown Center for Innovations in Medicine Biodesign Institute Arizona State University > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet > Sent: Monday, August 29, 2011 9:47 AM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Saving Codeml Output file > > Hi all, > I'm running 
codeml from the PAML package using the corresponding > Bioperl > wrapper. I'd like to save the output file as -outfile => 'mlc', as in: > my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > ( -outfile => 'mlc', > -save_tempfiles => 1, > -alignment => $codon_MSA, > -tree => $biotree, > -params => > { > #'outfile' =>'mlc', > 'verbose' => 1, > 'noisy' => 9, > 'runmode' => 0, #user tree > 'seqtype' => 1, > 'model' => $model, > 'NSsites' => $NSsites, > 'fix_omega' => $fix_omega, > 'omega' => $omega, > 'ncatG' => $ncatG, > 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below > (5:ciliate > nuclear) > #'fix_alpha' => 0, > #'fix_kappa' => > 0, #'RateAncestor' => 0, > 'CodonFreq' => 2, > 'cleandata' => > 1, # remove sites with amibguity data (1 yes, 0 no), > 'ndata' => 1 > }, > ); > > and subsequently parsing it using > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > "./"); > > However, I get the following message. > > ------------- EXCEPTION ------------- > MSG: Could not open mlc: No such file or directory > STACK Bio::Root::IO::_initialize_io > /Library/Perl//5.10.0/Bio/Root/IO.pm:351 > STACK Bio::Tools::Phylo::PAML::new > /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239 > STACK main::BranchSiteEvolAnalysis > /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421 > STACK toplevel > /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939 > ------------------------------------- > > what I guess means the output file is not being saved in the previous > step. > Anyone knows what's wrong. > Tnak you very much in advance for your help. 
> Cheers, > Lorenzo > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From locarpau at upvnet.upv.es Mon Aug 29 19:19:46 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Tue, 30 Aug 2011 01:19:46 +0200 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> Message-ID: <1314659986.4e5c1e9268078@webmail.upv.es> Kevin, That's pretty reasonable, but unfortunately still doesn't run. Even if I create the file as $outfile and give it as value to the wrapper as -outfile =>$outfile. It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at creating the outfile. Did anyone manage to generate the outfile from Bio::Tools::Run::Phylo::PAML::Codeml. Cheers, Lorenzo Mensaje citado por Kevin Brown : > OK, went back to the original message. > > And here's where the problem actually originates... 
> > my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > ( > # this should cause it to create a file > called mlc > -outfile => '>mlc', > -save_tempfiles => 1, > -alignment => > $codon_MSA, > -tree => > $biotree, > -params => > { > 'verbose' => 1, > 'noisy' => 9, > 'runmode' => 0, #user tree > 'seqtype' => 1, > 'model' => $model, > 'NSsites' => $NSsites, > 'fix_omega' => $fix_omega, > 'omega' => $omega, > 'ncatG' => $ncatG, > 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see > below (5:ciliate nuclear) > #'fix_alpha' => 0, > #'fix_kappa' => 0, > #'RateAncestor' => 0, > 'CodonFreq' => 2, > 'cleandata' => 1, # remove sites with amibguity data (1 yes, 0 > no), > 'ndata' => 1 > }, > ); > > > Kevin Brown > Center for Innovations in Medicine > Biodesign Institute > Arizona State University > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet > > Sent: Monday, August 29, 2011 9:47 AM > > To: bioperl-l at lists.open-bio.org > > Subject: [Bioperl-l] Saving Codeml Output file > > > > Hi all, > > I'm running codeml from the PAML package using the corresponding > > Bioperl > > wrapper. 
I'd like to save the output file as -outfile => 'mlc', as in: > > my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > > ( -outfile => 'mlc', > > -save_tempfiles => 1, > > -alignment => > $codon_MSA, > > -tree => > $biotree, > > -params => > > { > > #'outfile' =>'mlc', > > 'verbose' => 1, > > 'noisy' => 9, > > 'runmode' => 0, #user tree > > 'seqtype' => 1, > > 'model' => $model, > > 'NSsites' => $NSsites, > > 'fix_omega' => $fix_omega, > > 'omega' => $omega, > > 'ncatG' => $ncatG, > > 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below > > (5:ciliate > > nuclear) > > #'fix_alpha' => 0, > > #'fix_kappa' => > > 0, #'RateAncestor' > => 0, > > 'CodonFreq' => > 2, > > 'cleandata' => > > 1, # remove sites with amibguity data (1 yes, 0 no), > > 'ndata' => 1 > > > }, > > ); > > > > and subsequently parsing it using > > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > > "./"); > > > > However, I get the following message. > > > > ------------- EXCEPTION ------------- > > MSG: Could not open mlc: No such file or directory > > STACK Bio::Root::IO::_initialize_io > > /Library/Perl//5.10.0/Bio/Root/IO.pm:351 > > STACK Bio::Tools::Phylo::PAML::new > > /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239 > > STACK main::BranchSiteEvolAnalysis > > /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421 > > STACK toplevel > > /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939 > > ------------------------------------- > > > > what I guess means the output file is not being saved in the previous > > step. > > Anyone knows what's wrong. > > Tnak you very much in advance for your help. 
> > Cheers,
> > Lorenzo
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From jason.stajich at gmail.com Mon Aug 29 20:05:57 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Mon, 29 Aug 2011 17:05:57 -0700
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1314659986.4e5c1e9268078@webmail.upv.es>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> <1314659986.4e5c1e9268078@webmail.upv.es>
Message-ID:

I think you are mistaken about how to use the factory running objects and the associated parser.

You don't have to instantiate a parser, as this is what is returned by the run command. The whole point is that you don't need to get to the tempdir or specify opening of the mlc file or all the other output files from the program. You get to use the parser to get the data out, and then it cleans up afterwards, so you can run many iterations of runs in separate folders without having to clean up afterwards.

http://www.bioperl.org/wiki/HOWTO:PAML

my $factory = Bio::Tools::Run::Phylo::PAML::Codeml->new( ... );
my ($rc,$parser) = $factory->run( );

if( my $result = $parser->next_result ) {
    # $result is a Bio::Tools::Phylo::PAML object
}

On Aug 29, 2011, at 4:19 PM, Lorenzo Carretero Paulet wrote:

> Kevin,
> That's pretty reasonable, but unfortunately still doesn't run. Even if I create
> the file as $outfile and give it as value to the wrapper as -outfile
> =>$outfile. It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at
> creating the outfile.
Did anyone manage to generate the outfile from > Bio::Tools::Run::Phylo::PAML::Codeml. > Cheers, > Lorenzo > > Mensaje citado por Kevin Brown : > >> OK, went back to the original message. >> >> And here's where the problem actually originates... >> >> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml >> ( >> # this should cause it to create a file >> called mlc >> -outfile => '>mlc', >> -save_tempfiles => 1, >> -alignment => >> $codon_MSA, >> -tree => >> $biotree, >> -params => >> { >> 'verbose' => 1, >> 'noisy' => 9, >> 'runmode' => 0, #user tree >> 'seqtype' => 1, >> 'model' => $model, >> 'NSsites' => $NSsites, >> 'fix_omega' => $fix_omega, >> 'omega' => $omega, >> 'ncatG' => $ncatG, >> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see >> below (5:ciliate nuclear) >> #'fix_alpha' => 0, >> #'fix_kappa' => 0, >> #'RateAncestor' => 0, >> 'CodonFreq' => 2, >> 'cleandata' => 1, # remove sites with amibguity data (1 yes, 0 >> no), >> 'ndata' => 1 >> }, >> ); >> >> >> Kevin Brown >> Center for Innovations in Medicine >> Biodesign Institute >> Arizona State University >> >>> -----Original Message----- >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >>> bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet >>> Sent: Monday, August 29, 2011 9:47 AM >>> To: bioperl-l at lists.open-bio.org >>> Subject: [Bioperl-l] Saving Codeml Output file >>> >>> Hi all, >>> I'm running codeml from the PAML package using the corresponding >>> Bioperl >>> wrapper. 
I'd like to save the output file as -outfile => 'mlc', as in: >>> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml >>> ( -outfile => 'mlc', >>> -save_tempfiles => 1, >>> -alignment => >> $codon_MSA, >>> -tree => >> $biotree, >>> -params => >>> { >>> #'outfile' =>'mlc', >>> 'verbose' => 1, >>> 'noisy' => 9, >>> 'runmode' => 0, #user tree >>> 'seqtype' => 1, >>> 'model' => $model, >>> 'NSsites' => $NSsites, >>> 'fix_omega' => $fix_omega, >>> 'omega' => $omega, >>> 'ncatG' => $ncatG, >>> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below >>> (5:ciliate >>> nuclear) >>> #'fix_alpha' => 0, >>> #'fix_kappa' => >>> 0, #'RateAncestor' >> => 0, >>> 'CodonFreq' => >> 2, >>> 'cleandata' => >>> 1, # remove sites with amibguity data (1 yes, 0 no), >>> 'ndata' => 1 >>> >> }, >>> ); >>> >>> and subsequently parsing it using >>> my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => >>> "./"); >>> >>> However, I get the following message. >>> >>> ------------- EXCEPTION ------------- >>> MSG: Could not open mlc: No such file or directory >>> STACK Bio::Root::IO::_initialize_io >>> /Library/Perl//5.10.0/Bio/Root/IO.pm:351 >>> STACK Bio::Tools::Phylo::PAML::new >>> /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239 >>> STACK main::BranchSiteEvolAnalysis >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421 >>> STACK toplevel >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939 >>> ------------------------------------- >>> >>> what I guess means the output file is not being saved in the previous >>> step. >>> Anyone knows what's wrong. >>> Tnak you very much in advance for your help. 
>>> Cheers, >>> Lorenzo >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From fs5 at sanger.ac.uk Tue Aug 30 05:45:46 2011 From: fs5 at sanger.ac.uk (Frank Schwach) Date: Tue, 30 Aug 2011 10:45:46 +0100 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> References: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> Message-ID: <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk> Yes, I still have the primer3redux doc on my TODO list. Sorry, haven't had the time to do this lately but will loook into this as soon as I can. Frank On Mon, 2011-08-22 at 15:10 -0500, Chris Fields wrote: > On Aug 22, 2011, at 2:52 PM, Anand Patel wrote: > > > my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => > > "temp.out", -path => "/usr/bin/primer3_core"); > > > > If I use this: > > $primer3->add_targets( > > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > > =>$PRIMER_PRODUCT_SIZE_RANGE); > > > > I get: > > Can't locate object method "add_targets" via package > > "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. 
> > > > On the other hand, if I change that line to: > > $primer3->set_parameters( > > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > > =>$PRIMER_PRODUCT_SIZE_RANGE); > > > > It works. When I looked at the source code for Primer3Redux, I > > couldn't find add_targets, but set_parameters looked like it might > > work, so I used that instead, and it worked. > > > > But I see over in the github that there are other issues with the > > documentation (how primer3redux's result object is now 3 deep rather > > than 2 deep). Not sure if this is in that category or not. > > That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github, if you can file this as well it would help. > > I have to go back and look at the history of add_targets() reative to primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience... > > chris > > > Thanks, > > Anand > ... 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From manju.rawat2 at gmail.com Tue Aug 30 07:22:33 2011 From: manju.rawat2 at gmail.com (Manju Rawat) Date: Tue, 30 Aug 2011 07:22:33 -0400 Subject: [Bioperl-l] Bioperl query.... Message-ID: Hey Pls help me.. I am very new in Bioperl.. And i want to use blast report in my programming.. But i dnt know how to use it...pls tell me how to use HSP,gaps.etc methods??/ how to use them to extract valus from blast file.. Thanks Manju Rawat From roy.chaudhuri at gmail.com Tue Aug 30 07:25:32 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Tue, 30 Aug 2011 12:25:32 +0100 Subject: [Bioperl-l] Bioperl query.... In-Reply-To: References: Message-ID: <4E5CC8AC.8050800@gmail.com> Hi Manju, See: http://www.bioperl.org/wiki/HOWTO:SearchIO Cheers, Roy. On 30/08/2011 12:22, Manju Rawat wrote: > Hey Pls help me.. > I am very new in Bioperl.. > And i want to use blast report in my programming.. > But i dnt know how to use it...pls tell me how to use HSP,gaps.etc > methods??/ > how to use them to extract valus from blast file.. 
> > Thanks > Manju Rawat > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Aug 30 09:54:19 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 30 Aug 2011 08:54:19 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk> References: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk> Message-ID: <8063FB1D-4557-4D1B-B9EF-9833ECD440E9@illinois.edu> S'okay, we're all a bit busy :P chris On Aug 30, 2011, at 4:45 AM, Frank Schwach wrote: > Yes, I still have the primer3redux doc on my TODO list. Sorry, haven't > had the time to do this lately but will loook into this as soon as I > can. > Frank > > > On Mon, 2011-08-22 at 15:10 -0500, Chris Fields wrote: >> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote: >> >>> my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => >>> "temp.out", -path => "/usr/bin/primer3_core"); >>> >>> If I use this: >>> $primer3->add_targets( >>> 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, >>> 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, >>> 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, >>> 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, >>> 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, >>> 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, >>> 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, >>> 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, >>> 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' >>> =>$PRIMER_PRODUCT_SIZE_RANGE); >>> >>> I get: >>> Can't locate object method "add_targets" via package >>> "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. 
>>> >>> On the other hand, if I change that line to: >>> $primer3->set_parameters( >>> 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, >>> 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, >>> 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, >>> 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, >>> 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, >>> 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, >>> 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, >>> 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, >>> 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' >>> =>$PRIMER_PRODUCT_SIZE_RANGE); >>> >>> It works. When I looked at the source code for Primer3Redux, I >>> couldn't find add_targets, but set_parameters looked like it might >>> work, so I used that instead, and it worked. >>> >>> But I see over in the github that there are other issues with the >>> documentation (how primer3redux's result object is now 3 deep rather >>> than 2 deep). Not sure if this is in that category or not. >> >> That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github, if you can file this as well it would help. >> >> I have to go back and look at the history of add_targets() reative to primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience... >> >> chris >> >>> Thanks, >>> Anand >> ... 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > The Wellcome Trust Sanger Institute is operated by Genome Research > Limited, a charity registered in England with number 1021457 and a > company registered in England with number 2742969, whose registered > office is 215 Euston Road, London, NW1 2BE. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From locarpau at upvnet.upv.es Tue Aug 30 10:58:51 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Tue, 30 Aug 2011 16:58:51 +0200 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> <1314659986.4e5c1e9268078@webmail.upv.es> Message-ID: <1314716331.4e5cfaab4958e@webmail.upv.es> Thanks Jason, Ok, I see. That's what I was triying at the beggining. This runs OK in my scripts for branch-specific models. However, when I try branch-site models (NSsites > 0) and try to parse the results using my $model_result= $paml_result->get_NSSite_results I start to have problems. According to Dumper, I'm able to generate a Bio::Tools::Phylo::PAML object $paml_result but this doesn't store any Bio::Tools::Phylo::PAML::ModelResult that could be accessed using get_NSSite_results. See below a little piece of code to illustrate what I'm saying. my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml ( -alignment => $codon_MSA, -tree => $biotree, -params => { ... ...parameter values ... 
        },
    );
my ($rc,$parser) = $codeml_factory->run(); # or run($dna_aln,$biotree)
#$codeml_factory->cleanup();
my $paml_result = $parser->next_result;
say Dumper $paml_result; #This returns a true Bio::Tools::Phylo::PAML::Result object!!!
my $model_result= $paml_result->get_NSSite_results;
say Dumper $model_result; #This doesn't return a true Bio::Tools::Phylo::PAML::ModelResult object ($VAR1 = 0;)!!!
$ns_string = "model ".$model_result->model_num."\n".$model_result->model_description()."\n".$model_result->time_used."\n";

As no ModelResult object is generated, the script stops, returning:

Can't call method "model_num" without a package or object reference

That's why I was trying to save the mlc output file and parse it, instead of directly parsing the Bio::Tools::Phylo::PAML object.
Best,
Lorenzo

PS: I'm using paml version 4.4b, July 2010 and Bioperl 1.006901 on Mac OS X.

Quoting Jason Stajich:

> I think you are mistaken on how to use the factory running objects and
> associated parser.
>
> You don't have to instantiate a parser as this is what is returned by the run
> command. The whole point is you don't need to get to the tempdir or specify
> opening of the mlc file or all the other output files from the program. you
> get to use the parser to get the data out and then it cleans up afterwards so
> you can run many iterations of runs in separate folders without having to
> cleanup afterwards.
>
> http://www.bioperl.org/wiki/HOWTO:PAML
>
> my $factory = Bio::Tools::Run::Phylo::PAML::Codeml->new( ... );
> my ($rc,$parser) = $factory->run( );
>
> if( my $result = $parser->next_result ) {
> # $result is a Bio::Tools::Phylo::PAML object
> }
>
>
> On Aug 29, 2011, at 4:19 PM, Lorenzo Carretero Paulet wrote:
>
> > Kevin,
> > That's pretty reasonable, but unfortunately still doesn't run. Even if I
> create
> > the file as $outfile and give it as value to the wrapper as -outfile
> > =>$outfile.
It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at > > creating the outfile. Did anyone manage to generate the outfile from > > Bio::Tools::Run::Phylo::PAML::Codeml? > > Cheers, > > Lorenzo > > > > Quoting Kevin Brown: > > > >> OK, went back to the original message. > >> > >> And here's where the problem actually originates... > >> > >> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > >> ( > >> # this should cause it to create a file > >> called mlc > >> -outfile => '>mlc', > >> -save_tempfiles => 1, > >> -alignment => > >> $codon_MSA, > >> -tree => > >> $biotree, > >> -params => > >> { > >> 'verbose' => 1, > >> 'noisy' => 9, > >> 'runmode' => 0, #user tree > >> 'seqtype' => 1, > >> 'model' => $model, > >> 'NSsites' => $NSsites, > >> 'fix_omega' => $fix_omega, > >> 'omega' => $omega, > >> 'ncatG' => $ncatG, > >> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see > >> below (5:ciliate nuclear) > >> #'fix_alpha' => 0, > >> #'fix_kappa' => 0, > >> #'RateAncestor' => 0, > >> 'CodonFreq' => 2, > >> 'cleandata' => 1, # remove sites with ambiguity data (1 yes, 0 > >> no), > >> 'ndata' => 1 > >> }, > >> ); > >> > >> > >> Kevin Brown > >> Center for Innovations in Medicine > >> Biodesign Institute > >> Arizona State University > >> > >>> -----Original Message----- > >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] > >>> On Behalf Of Lorenzo Carretero Paulet > >>> Sent: Monday, August 29, 2011 9:47 AM > >>> To: bioperl-l at lists.open-bio.org > >>> Subject: [Bioperl-l] Saving Codeml Output file > >>> > >>> Hi all, > >>> I'm running codeml from the PAML package using the corresponding > >>> Bioperl > >>> wrapper. 
I'd like to save the output file as -outfile => 'mlc', as in: > >>> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > >>> ( -outfile => 'mlc', > >>> -save_tempfiles => 1, > >>> -alignment => > >> $codon_MSA, > >>> -tree => > >> $biotree, > >>> -params => > >>> { > >>> #'outfile' =>'mlc', > >>> 'verbose' => 1, > >>> 'noisy' => 9, > >>> 'runmode' => 0, #user tree > >>> 'seqtype' => 1, > >>> 'model' => $model, > >>> 'NSsites' => $NSsites, > >>> 'fix_omega' => $fix_omega, > >>> 'omega' => $omega, > >>> 'ncatG' => $ncatG, > >>> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below > >>> (5:ciliate > >>> nuclear) > >>> #'fix_alpha' => 0, > >>> #'fix_kappa' => > >>> 0, #'RateAncestor' > >> => 0, > >>> 'CodonFreq' => > >> 2, > >>> 'cleandata' => > >>> 1, # remove sites with amibguity data (1 yes, 0 no), > >>> 'ndata' => 1 > >>> > >> }, > >>> ); > >>> > >>> and subsequently parsing it using > >>> my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > >>> "./"); > >>> > >>> However, I get the following message. > >>> > >>> ------------- EXCEPTION ------------- > >>> MSG: Could not open mlc: No such file or directory > >>> STACK Bio::Root::IO::_initialize_io > >>> /Library/Perl//5.10.0/Bio/Root/IO.pm:351 > >>> STACK Bio::Tools::Phylo::PAML::new > >>> /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239 > >>> STACK main::BranchSiteEvolAnalysis > >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421 > >>> STACK toplevel > >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939 > >>> ------------------------------------- > >>> > >>> what I guess means the output file is not being saved in the previous > >>> step. > >>> Anyone knows what's wrong. > >>> Tnak you very much in advance for your help. 
> >>> Cheers, > >>> Lorenzo > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From shalabh.sharma7 at gmail.com Tue Aug 30 11:26:00 2011 From: shalabh.sharma7 at gmail.com (shalabh sharma) Date: Tue, 30 Aug 2011 11:26:00 -0400 Subject: [Bioperl-l] Bioperl query.... In-Reply-To: <4E5CC8AC.8050800@gmail.com> References: <4E5CC8AC.8050800@gmail.com> Message-ID: Hi Manju, Just follow the link sent by Roy. It also contains some useful example scripts. What I am suggesting is that you run a blast on a very small data set that you can inspect easily and manually. Then parse it using SearchIO (follow the link) and you will get a fair idea of how it works. -Shalabh On Tue, Aug 30, 2011 at 7:25 AM, Roy Chaudhuri wrote: > Hi Manju, > > See: > http://www.bioperl.org/wiki/HOWTO:SearchIO > > Cheers, > Roy. > > > On 30/08/2011 12:22, Manju Rawat wrote: > >> Hey Pls help me.. >> I am very new in Bioperl.. >> And i want to use blast report in my programming.. >> But i dnt know how to use it...pls tell me how to use HSP,gaps.etc >> methods??/ >> how to use them to extract values from blast file.. 
>> >> Thanks >> Manju Rawat >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Shalabh Sharma Scientific Computing Professional Associate (Bioinformatics Specialist) Department of Marine Sciences University of Georgia Athens, GA 30602-3636 From longbow0 at gmail.com Wed Aug 31 11:48:16 2011 From: longbow0 at gmail.com (longbow leo) Date: Wed, 31 Aug 2011 10:48:16 -0500 Subject: [Bioperl-l] How to color leaves of a tree by Bio::Tree::Draw::Cladogram? Message-ID: Dear all, I am using the module Bio::Tree::Draw::Cladogram to create a tree diagram. But when I tried to color the tree leaves, the diagram was still without any colors. How can I color the tree leaves? Thanks in advance. Here is my script: ###################################################################### #!/usr/bin/perl use strict; use warnings; use Bio::TreeIO; use Bio::Tree::Draw::Cladogram; my $treei = Bio::TreeIO->new( -fh => \*DATA, -format => 'newick', ); my $tree = $treei->next_tree; # Color node 'B' to red my ($nodeB) = $tree->find_node( -id => 'B' ); $nodeB->add_tag_value('Rcolor', 1); $nodeB->add_tag_value('Gcolor', 0); $nodeB->add_tag_value('Bcolor', 0); my $cg = Bio::Tree::Draw::Cladogram->new( -tree => $tree, ); $cg->print( -file => 'mytree.eps' ); __DATA__ (((A:5,B:5)90:2,C:4)25:3,D:10); ###################################################################### Regards, Haizhou From roy.chaudhuri at gmail.com Wed Aug 31 12:02:30 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 31 Aug 2011 17:02:30 +0100 Subject: [Bioperl-l] How to color leaves of a tree by Bio::Tree::Draw::Cladogram? 
In-Reply-To: References: Message-ID: <4E5E5B16.9070704@gmail.com> Hi Haizhou, I think you need to specify -colors=>1 in your Bio::Tree::Draw::Cladogram constructor: my $cg = Bio::Tree::Draw::Cladogram->new( -tree => $tree, -colors => 1 ); Not sure why this isn't on by default. Roy. On 31/08/2011 16:48, longbow leo wrote: > Dear all, > > I am using the module Bio::Tree::Draw::Cladogram to create a tree diagram. > But when I tried to color the tree leaves, the diagram was still without any > colors. > > How can I color tree leave? Thanks in advance. > > Here is my script: > > ###################################################################### > > > #!/usr/bin/perl > > use strict; > use warnings; > > use Bio::TreeIO; > use Bio::Tree::Draw::Cladogram; > > my $treei = Bio::TreeIO->new( > -fh => \*DATA, > -format => 'newick', > ); > > my $tree = $treei->next_tree; > > # Color node 'B' to red > my ($nodeB) = $tree->find_node( -id => 'B' ); > > $nodeB->add_tag_value('Rcolor', 1); > $nodeB->add_tag_value('Gcolor', 0); > $nodeB->add_tag_value('Bcolor', 0); > > my $cg = Bio::Tree::Draw::Cladogram->new( > -tree => $tree, > ); > > $cg->print( -file => 'mytree.eps' ); > > __DATA__ > (((A:5,B:5)90:2,C:4)25:3,D:10); > > > ###################################################################### > > Regards, > > Haizhou > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Mon Aug 1 00:07:38 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 31 Jul 2011 23:07:38 -0500 Subject: [Bioperl-l] BioPerl Test requirements Message-ID: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> All, We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. 
One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with Debian packaging. As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). Anyway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, i.e. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Differences (Bio::Root::Test has a few additional methods of use as well). As this will require developers to use Test::Most instead, though, I thought it would be worth asking on the list to see if there are any objections. Any thoughts? chris From cjfields at illinois.edu Mon Aug 1 00:42:39 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 31 Jul 2011 23:42:39 -0500 Subject: [Bioperl-l] protaparam In-Reply-To: References: Message-ID: <44853A9D-9E78-469E-B8D8-B06EBDB5F780@illinois.edu> Shachi, My guess is this is not a BioPerl-specific issue, but that the web service interface has changed or is no longer active. Unfortunately this is one module that has no tests associated with it, so this slipped through the cracks. You are more than welcome to file a bug on this, but if the service is inactive we'll likely immediately deprecate the module. 
chris On Jul 28, 2011, at 11:46 PM, Shachi Gahoi wrote: > Dear All, > > If anybody know how to rum protparam using bioperl please let me know. > > > Thanks in advance > > -- > Regards, > Shachi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason.stajich at gmail.com Mon Aug 1 03:12:32 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Sun, 31 Jul 2011 23:12:32 -0800 Subject: [Bioperl-l] Fwd: Bio::Tools::Run::Phylo::Phyml, tree_string References: Message-ID: <3521B67E-D158-492A-8A60-025D6C5C9934@gmail.com> Heikki - can you take a look at this when you get time - I'm unclear what the BIONJ string is used for? Begin forwarded message: > From: Tristan Lefebure > Date: July 27, 2011 6:12:16 AM AKDT > To: bioperl mailing list > Subject: Re: [Bioperl-l] Bio::Tools::Run::Phylo::Phyml, tree_string > > done: > https://redmine.open-bio.org/issues/3273 > > -- > Tristan > > On Tue, Jul 26, 2011 at 9:43 PM, Chris Fields wrote: >> That's an odd one. Could you file this on redmine? >> >> chris >> >> On Jul 26, 2011, at 10:14 AM, Tristan Lefebure wrote: >> >>> Ouups, I found a typo in my post, it should read: >>> >>> I am not quite sure I understand why tree_string() from >>> Bio::Tools::Run::Phylo::Phyml returns >>> a string that looks like that (I removed the end of the tree): >>> >>> BIONJ(((((((('92':0.0114354726,'12':0.0472591023)0.0000000000:0.0000005859,... >>> >>> On Tue, Jul 26, 2011 at 4:47 PM, Tristan Lefebure >>> wrote: >>>> Hi there, >>>> I am not quite sure I understand why tree_string() from Bio::Tools::Run::Phylo::Phyml returns >>>> a string that looks like that (I removed the end of the tree): >>>> >>>> Tree is BIONJ(((((((('92':0.0114354726,'12':0.0472591023)0.0000000000:0.0000005859,... >>>> >>>> Why do we have this 'Tree is BIONJ' thing? 
>>>> >>>> A quick look at the code in the _run() function gives : >>>> >>>> { >>>> open(my $FH_TREE, "<", $tree_file) >>>> || $self->throw("Phyml call ($command) did not give an output: $?"); >>>> local $/; >>>> $self->{_tree} .= <$FH_TREE>; >>>> } >>>> >>>> Why appending something to $self->{_tree}? What about? >>>> $self->{_tree} = <$FH_TREE>; >>>> >>>> I was about to fill a bug report, but then I saw that in Phyml.t: >>>> >>>> is substr($factory->tree_string, 0, 9), 'BIONJ(SIN', 'tree_string()'; >>>> >>>> Well, I am lost. Any help much appreciated... >>>> >>>> -- >>>> Tristan >>>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Mon Aug 1 05:09:47 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 1 Aug 2011 11:09:47 +0200 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: Sounds good, Chris. Go for it. Dave From hlapp at drycafe.net Mon Aug 1 16:30:18 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Mon, 1 Aug 2011 16:30:18 -0400 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. -hilmar On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: > All, > > We are currently using a BioPerl-specific module for running tests > called Bio::Root::Test. 
It is essentially a wrapper module, re- > exporting all the methods for Test::More, Test::Exception, and > Test::Warn. One problem: it currently expects a copy of Test::Warn > and Test::Exception in each repository as a fallback. Another > problem: these included modules appear to be triggering dependencies > with debian packaging. > > As an example of one hidden dependency, the included Test::Warn > requires Array::Compare, which converted to Moose a few years ago, > so you automatically have to install the entire Moose dependency > tree, even though Bioperl doesn't require it (not a slam on Moose, > you really SHOULD be using Moose these days. No, really :). > > Anway, more recent versions of Test::Warn don't have this > requirement, but as we package an old version of this module we get > stuck with the dependencies until we (manually) update this for each > repository. Ick. > > I think the best solution is to remove the bioperl-local modules in > t/lib and list Test::Most instead as a 'build_requires' in Build.PL, > e.g. the module is only necessary for the build phase so is > optionally installed. Test::Most essentially does exactly the same > thing as Bio::Root::Test and more; it also includes Test::Deep and > Test::Diff (Bio::Root::Test has a few additional methods of use as > well). > > As this will require developers to use Test::Most instead, though, I > though it would be worth asking on the list to see if there are any > objections. Any thoughts? 
> > > chris > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Mon Aug 1 16:34:56 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 1 Aug 2011 15:34:56 -0500 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! chris On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. > > -hilmar > > On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: > >> All, >> >> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >> >> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). 
>> >> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >> >> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >> >> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >> >> >> chris >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at drycafe.net Mon Aug 1 18:36:27 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Mon, 1 Aug 2011 18:36:27 -0400 Subject: [Bioperl-l] Job opportunity: User Interface Design and Web Application Developer Message-ID: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net> (Apologies if you have received this already or if this is considered spam - we're trying to reach out as broad as possible and I know that quite a few in the Bio* communities would be well qualified. Please feel free to pass on to anyone who might be interested, or might know someone who is.) 
User Interface Design and Web Application Developer The National Evolutionary Synthesis Center (NESCent) seeks a creative and enthusiastic individual to design user interfaces and web applications for scientific applications that manage, analyze, visualize and share data in support of evolutionary research. The incumbent will work as part of a small informatics team in close collaboration with domain scientists. NESCent (http://nescent.org) is an NSF-funded center dedicated to cross-disciplinary research in evolutionary science. Our informatics team works closely with visiting and resident scientists to support their custom software and database development needs (http://informatics.nescent.org ), and collaborates broadly with other biodiversity informatics projects. All NESCent software products are open-source, and the Center has a number of initiatives to actively promote collaborative development of community software resources. Above all, we are enthusiastic about our work, about the mission of the Center, and about the contribution of informatics to that mission. Job description: The incumbent will design and develop user interfaces and web applications for databases and other software tools for sponsored scientists and staff. The job responsibilities include all stages of the software development process, including requirements gathering, design, implementation, release packaging and documentation, as part of a small team (typically 2-3 individuals). We expect the incumbent to present their work at conferences and contribute to publications with scientific collaborators; interact regularly with visiting and resident scientists, other members of the informatics team and Center staff; and generally serve as an expert resource for Center personnel. The position provides opportunities for professional development and encourages research into new technologies. 
Most informatics staff work at our Durham NC offices, located adjacent to Duke University, but we support a wide range of technologies for virtual communication with off-site staff and collaborators. Salary range: $70,000 - $80,000, depending on education and experience Required Qualifications: * Demonstrated success collaborating with clients on custom software solutions * Experience with various stages of the software development cycle * Expertise in development and testing of user interface designs * Excellent communication skills, both virtual and face-to-face Preferred Qualifications: * M.S. or Ph.D. in Computer Science, Bioinformatics or related field * Demonstrated interest in science, particularly biology * Expertise in dynamic and interactive web technologies (JavaScript, CGI) * Expertise in rapid application development and respective programming technologies and languages (e.g., modern scripting languages and web-application frameworks such as Python/Django, Ruby/ Ruby-on-Rails, and Perl/Catalyst). * Expertise in graphic design * Expertise in data visualization and/or scientific data integration * Expertise in software usability design and assessment * Expertise in web service (SOAP, REST, XML, JSON) and semantic web technologies * Fluency in Java programming * Prior experience in relational database programming (PostgreSQL or MySQL) * Experience with open-source, and collaborative, software development How to apply: Please send cover letter, resume and contact information for three references to Dr. Karen Cranston, Training Coordinator and Bioinformatics Project Manager (karen.cranston at nescent.org); Please also complete the online application at the University of North Carolina HR website: http://bit.ly/r9HQ8r. Informal inquires or requests for additional information may be directed to Dr. Cranston by email or phone (+1-919-613-2275). Closing date is August 15, 2011. 
-- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From florent.angly at gmail.com Mon Aug 1 20:09:51 2011 From: florent.angly at gmail.com (Florent Angly) Date: Tue, 02 Aug 2011 10:09:51 +1000 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Message-ID: <4E37404F.1040001@gmail.com> If Test::Most gives more testing capabilities and makes packaging Bioperl easier, I think it's pretty sweet! Florent On 02/08/11 06:34, Chris Fields wrote: > Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! > > chris > > On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > >> I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. >> >> -hilmar >> >> On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: >> >>> All, >>> >>> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >>> >>> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). 
>>> >>> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >>> >>> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >>> >>> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >>> >>> >>> chris >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : >> =========================================================== >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hartzell at alerce.com Mon Aug 1 20:06:54 2011 From: hartzell at alerce.com (George Hartzell) Date: Mon, 1 Aug 2011 17:06:54 -0700 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: <20023.16286.89015.854814@gargle.gargle.HOWL> Sounds great. g. 
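For readers who have not used Test::Most, the proposal in this thread can be illustrated with a minimal test file. This is only a sketch under stated assumptions: the file name t/example.t and the toy values are hypothetical and not taken from any BioPerl repository, and it assumes Test::Most is installed (for instance because Build.PL lists it under build_requires, as suggested above). It relies on Test::Most re-exporting the functions of Test::More, Test::Exception, Test::Warn, Test::Deep and Test::Differences (presumably the module meant by 'Test::Diff' in the thread).

```perl
#!/usr/bin/perl
# Hypothetical t/example.t; an illustration only, not BioPerl code.
# Assumes Test::Most is available, e.g. via a Build.PL entry such as
#     build_requires => { 'Test::Most' => 0 },
use strict;
use warnings;
use Test::Most tests => 5;

# Test::More: plain value comparison
is( length('ATGC'), 4, 'is() comes from Test::More' );

# Test::Exception: the block is expected to die with a matching message
throws_ok { die "bad sequence\n" } qr/bad sequence/,
    'throws_ok() comes from Test::Exception';

# Test::Warn: the block is expected to emit a matching warning
warning_like { warn "deprecated\n" } qr/deprecated/,
    'warning_like() comes from Test::Warn';

# Test::Deep: structural comparison of nested data
cmp_deeply(
    { id => 'seq1', len => 4 },
    { id => 'seq1', len => 4 },
    'cmp_deeply() comes from Test::Deep'
);

# Test::Differences: like is_deeply, but prints a diff on failure
eq_or_diff( [ 'A', 'T' ], [ 'A', 'T' ],
    'eq_or_diff() comes from Test::Differences' );
```

A file like this would normally be run with prove (for example "prove -l t/example.t") or through "./Build test". Nothing needs to be bundled under t/lib, which is the maintenance saving the thread is after.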
From carandraug+dev at gmail.com Tue Aug 2 10:00:32 2011 From: carandraug+dev at gmail.com (Carnë Draug) Date: Tue, 2 Aug 2011 15:00:32 +0100 Subject: [Bioperl-l] wiki administrator needed Message-ID: Hi! I have a problem with the bioperl wiki and have sent a support request to 'support at open-bio.org' as instructed here (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I got the ticket ID #966. This was 2 weeks ago. Can someone with administrator rights on the wiki do something about it? Thanks in advance, Carnë Draug From p.j.a.cock at googlemail.com Tue Aug 2 10:56:30 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 2 Aug 2011 15:56:30 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: Message-ID: 2011/8/2 Carnë Draug : > Hi! > > I have a problem with the bioperl wiki and have sent a support request > to 'support at open-bio.org' as instructed here > (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I > got the ticket ID #966. This was 2 weeks ago. Can someone with > administrator rights on the wiki do something about it? > > Thanks in advance, > Carnë Draug What was the problem with the wiki (for the benefit of those of us who might be able to fix it but are not on the support system and didn't get your email)? Peter From carandraug+dev at gmail.com Tue Aug 2 11:06:10 2011 From: carandraug+dev at gmail.com (Carnë Draug) Date: Tue, 2 Aug 2011 16:06:10 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: Message-ID: 2011/8/2 Peter Cock : > 2011/8/2 Carnë Draug : >> I have a problem with the bioperl wiki and have sent a support request >> to 'support at open-bio.org' as instructed here >> (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I >> got the ticket ID #966. This was 2 weeks ago. Can someone with >> administrator rights on the wiki do something about it? 
> > What was the problem with the wiki (for the benefit of those > of us who might be able to fix it but are not on the support > system and didn't get your email)? Guess there should be no problem mentioning this on this open mailing list. Here's the e-mail I sent back then: When logging in with OpenID, I accidentally created a new account. Now I can't use that OpenID for my real account since it's connected to that other account. It also doesn't let me remove that OpenID from that account. My real account has the nickname 'Carandraug'. The account I created by accident has the nickname '~carandraug' (because I was trying to connect my account with the OpenID of https://launchpad.net/~carandraug). Could someone please remove the '~carandraug' account? I couldn't find a button to do so. From hlapp at drycafe.net Tue Aug 2 12:25:48 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Tue, 2 Aug 2011 12:25:48 -0400 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: Message-ID: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> I don't think the wiki allows removing of accounts (only blocking). Someone would have to go into the MySQL database and do that. -hilmar On Aug 2, 2011, at 11:06 AM, Carnë Draug wrote: > 2011/8/2 Peter Cock : >> 2011/8/2 Carnë Draug : >>> I have a problem with the bioperl wiki and have sent a support >>> request >>> to 'support at open-bio.org' as instructed here >>> (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems). I >>> got the ticket ID #966. This was 2 weeks ago. Can someone with >>> administrator rights on the wiki do something about it? >> >> What was the problem with the wiki (for the benefit of those >> of us who might be able to fix it but are not on the support >> system and didn't get your email)? > > Guess there should be no problem mentioning this on this open mailing > list. Here's the e-mail I sent back then: > > When logging in with OpenID, I accidentally created a new account. 
Now I > can't use that OpenID for my real account since it's connected to that > other account. It also doesn't let me remove that OpenID from that > account. > > My real account has the nickname 'Carandraug'. > > The account I created by accident has the nickname '~carandraug' > (because I was trying to connect my account with the OpenID of > https://launchpad.net/~carandraug). > > Could someone please remove the '~carandraug' account? I couldn't find > a button to do so. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From p.j.a.cock at googlemail.com Tue Aug 2 12:27:11 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 2 Aug 2011 17:27:11 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: 2011/8/2 Hilmar Lapp : > I don't think the wiki allows removing of accounts (only blocking). Someone > would have to go into the MySQL database and do that. The MediaWiki FAQ says don't do that, but does mention an optional add-on for merging wiki user accounts. We could block the unwanted account instead. Peter From cjfields at illinois.edu Tue Aug 2 12:35:36 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 11:35:36 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: I don't know if blocking that account will solve the OpenID problem (that it is associated with the bad account), but maybe merging that account and Carnë's good one will work. Maybe it's worth looking at the add-on. 
chris On Aug 2, 2011, at 11:27 AM, Peter Cock wrote: > 2011/8/2 Hilmar Lapp : >> I don't think the wiki allows removing of accounts (only blocking). Someone >> would have to go into the MySQL database and do that. > > The MediaWiki FAQ says don't do that, but does mention an > optional add-on for merging wiki user accounts. > > We could block the unwanted account instead. > > Peter > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Aug 2 12:38:01 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 11:38:01 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: Carn?, Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. chris On Aug 2, 2011, at 11:27 AM, Peter Cock wrote: > 2011/8/2 Hilmar Lapp : >> I don't think the wiki allows removing of accounts (only blocking). Someone >> would have to go into the MySQL database and do that. > > The MediaWiki FAQ says don't do that, but does mention an > optional add-on for merging wiki user accounts. > > We could block the unwanted account instead. > > Peter > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From carandraug+dev at gmail.com Tue Aug 2 12:58:41 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 2 Aug 2011 17:58:41 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: On 2 August 2011 17:38, Chris Fields wrote: > Try logging in with the bad account, then go under 'my preferences'. 
There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. I had tried that the first time. However, it didn't let me do it because that OpenID was the one used to create the account. Carnë From ihok at hotmail.com Tue Aug 2 13:29:43 2011 From: ihok at hotmail.com (Jack Tanner) Date: Tue, 2 Aug 2011 13:29:43 -0400 Subject: [Bioperl-l] fastq quality with initial @ Message-ID: i've got a fastq file with PHRED quality strings that sometimes start with '@'. this breaks the _index_file routine in Bio/Index/Fastq.pm. i would've filed this in bugzilla, but i'm not authorized to do that. From cjfields at illinois.edu Tue Aug 2 14:59:00 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 13:59:00 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: Let's see if we can get the merge account add-in working, then. chris On Aug 2, 2011, at 11:58 AM, Carnë Draug wrote: > On 2 August 2011 17:38, Chris Fields wrote: >> Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. > > I had tried that the first time. However, it didn't let me do it because > that OpenID was the one used to create the account. > > Carnë > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Aug 2 15:00:47 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 14:00:47 -0500 Subject: [Bioperl-l] fastq quality with initial @ In-Reply-To: References: Message-ID: <441DB637-5586-488F-8943-FEA4D56C276B@illinois.edu> On Aug 2, 2011, at 12:29 PM, Jack Tanner wrote: > > i've got a fastq file with PHRED quality strings that sometimes start with '@'.
this breaks the _index_file routine in Bio/Index/Fastq.pm. > i would've filed this in bugzilla, but i'm not authorized to do that. We no longer use bugzilla (as of v 1.6.900); see here: http://www.bioperl.org/wiki/Bugs Just register for an account and submit. I would check the latest code before doing so, just in case it has been fixed. chris From bosborne11 at verizon.net Tue Aug 2 16:24:54 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 02 Aug 2011 16:24:54 -0400 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: Chris, This is the one I've used: http://www.mediawiki.org/wiki/Extension:User_Merge_and_Delete BIO On Aug 2, 2011, at 2:59 PM, Chris Fields wrote: > Let's see if we can get the merge account add-in working, then. > > chris > > On Aug 2, 2011, at 11:58 AM, Carn? Draug wrote: > >> On 2 August 2011 17:38, Chris Fields wrote: >>> Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. >> >> I had try that the first time. However, it didn't let me do it because >> that OpenID was the one used to create the account. >> >> Carn? 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Aug 2 18:01:42 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 17:01:42 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Carn?, I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. See if that works. chris On Aug 2, 2011, at 1:59 PM, Chris Fields wrote: > Let's see if we can get the merge account add-in working, then. > > chris > > On Aug 2, 2011, at 11:58 AM, Carn? Draug wrote: > >> On 2 August 2011 17:38, Chris Fields wrote: >>> Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. >> >> I had try that the first time. However, it didn't let me do it because >> that OpenID was the one used to create the account. >> >> Carn? 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From carandraug+dev at gmail.com Tue Aug 2 18:19:38 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 2 Aug 2011 23:19:38 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On 2 August 2011 23:01, Chris Fields wrote: > Carn?, > > I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. ?See if that works. > > chris When I try to add this OpenID to my account, I still get the error: "That is someone else's OpenID." If I try to log in with this OpenID, after saying that I'm logged in successfully, the site still looks as if I'm not logged in, with a button to 'log in' and an IP address instead of a username. Another problem that I have when logging is that sometimes mediawiki sends 'https://login.launchpad.net/ id/y7xtYzD' instead of 'https://login.launchpad.net/~carandraug' to the launchpad server. I don't know what's causing this. Trying to backspace and delete what may be invisible characters before and after the string sometimes solves this. This happens even though I type this character by character so if there's any invisble stuff on the form it must be there before. This occurs when using Iceweasel 3.5 (in Debian), Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). Carn? 
Draug From cjfields at illinois.edu Tue Aug 2 18:39:19 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 17:39:19 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On Aug 2, 2011, at 5:19 PM, Carnë Draug wrote: > On 2 August 2011 23:01, Chris Fields wrote: >> Carnë, >> >> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug), and deleted the old account. See if that works. >> >> chris > > When I try to add this OpenID to my account, I still get the error: > "That is someone else's OpenID." Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. Kinda wondered if that would happen... > If I try to log in with this OpenID, after saying that I'm logged in > successfully, the site still looks as if I'm not logged in, with a > button to 'log in' and an IP address instead of a username. > > Another problem that I have when logging is that sometimes mediawiki > sends 'https://login.launchpad.net/ id/y7xtYzD' instead of > 'https://login.launchpad.net/~carandraug' to the launchpad server. I > don't know what's causing this. Trying to backspace and delete what > may be invisible characters before and after the string sometimes > solves this. This happens even though I type this character by > character so if there's any invisible stuff on the form it must be > there before. This occurs when using Iceweasel 3.5 (in Debian), > Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). > > Carnë Draug Not sure myself, sounds like a MW bug. See if the OpenID works first, then maybe we can address that.
chris From carandraug+dev at gmail.com Tue Aug 2 18:56:49 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 2 Aug 2011 23:56:49 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: 2011/8/2 Chris Fields : > On Aug 2, 2011, at 5:19 PM, Carnë Draug wrote: >> On 2 August 2011 23:01, Chris Fields wrote: >>> Carnë, >>> >>> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug), and deleted the old account. See if that works. >>> >> >> When I try to add this OpenID to my account, I still get the error: >> "That is someone else's OpenID." > > Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. This is solved. I connected my account with this OpenID and can now log in with it. Thank you >> Another problem that I have when logging is that sometimes mediawiki >> sends 'https://login.launchpad.net/ id/y7xtYzD' instead of >> 'https://login.launchpad.net/~carandraug' to the launchpad server. I >> don't know what's causing this. Trying to backspace and delete what >> may be invisible characters before and after the string sometimes >> solves this. This happens even though I type this character by >> character so if there's any invisible stuff on the form it must be >> there before. This occurs when using Iceweasel 3.5 (in Debian), >> Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). This still happens sometimes. It just happened now. I had also filed a support request about this issue some weeks ago (ticket #965). No idea what's been causing this. Carnë
From cjfields at illinois.edu Tue Aug 2 21:55:23 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 20:55:23 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On Aug 2, 2011, at 5:56 PM, Carnë Draug wrote: > 2011/8/2 Chris Fields : >> On Aug 2, 2011, at 5:19 PM, Carnë Draug wrote: >>> On 2 August 2011 23:01, Chris Fields wrote: >>>> Carnë, >>>> >>>> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug), and deleted the old account. See if that works. >>>> >>> >>> When I try to add this OpenID to my account, I still get the error: >>> "That is someone else's OpenID." >> >> Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. > > This is solved. I connected my account with this OpenID and can now > log in with it. Thank you No problem. Apparently there is a bug fix in the more recent versions of OpenID and UserMerge, I'll add a redmine task to make sure they get updated (have my hands full right now, and OpenID can sometimes be tricky to debug). >>> Another problem that I have when logging is that sometimes mediawiki >>> sends 'https://login.launchpad.net/ id/y7xtYzD' instead of >>> 'https://login.launchpad.net/~carandraug' to the launchpad server. I >>> don't know what's causing this. Trying to backspace and delete what >>> may be invisible characters before and after the string sometimes >>> solves this. This happens even though I type this character by >>> character so if there's any invisible stuff on the form it must be >>> there before. This occurs when using Iceweasel 3.5 (in Debian), >>> Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). > > This still happens sometimes. It just happened now.
I had also filed a > support request about this issue some weeks ago (ticket #965). No idea > what's been causing this. > > Carnë Okay, as long as it's noted somewhere. chris From kai.blin at biotech.uni-tuebingen.de Wed Aug 3 04:55:04 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Wed, 03 Aug 2011 10:55:04 +0200 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior Message-ID: <4E390CE8.2050100@biotech.uni-tuebingen.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folks, as I mentioned on https://redmine.open-bio.org/issues/3264 there is something odd going on with Bio::Root::IO's _readline/_pushback functions. This seems to be intentional, at least there is a test case asserting the behaviour I'm seeing. It is however very confusing to the unsuspecting programmer using the code. One assumption I'd immediately make would be that if I have code that does a $foo = $io->_readline; $io->_pushback($foo); $bar = $io->_readline;, $foo will be the same string as $bar, regardless what other pieces of the code did. Currently, this is not the case, because the readbuffer that _pushback pushes back into has new strings appended to the end but readline removes them from the front. This easily violates the "principle of least surprise", so I think we should change the readbuffer to a stack. As far as I can tell, changing the _pushback function to "unshift" instead of "push" to the readbuffer breaks only the Root/RootIO.t test designed to test the old behaviour. I don't see any other tests failing on my system that don't fail without this patch. Any comments from the core devs? Cheers, Kai - -- Dipl.-Inform.
Kai Blin kai.blin at biotech.uni-tuebingen.de Institute for Microbiology and Infection Medicine Division of Microbiology/Biotechnology Eberhard-Karls-Universität Tübingen Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 D-72076 Tübingen Fax : ++49 7071 29-5979 Germany Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJOOQzoAAoJEKM5lwBiwTTPO6QIAMDN1bAm1FFD98F0rhN7TCpW sV2sLkQDESK9YjCxp3kAqCpg7ZCArcA5l7HmEdAZFTzdFnsfnvKJmNB86C30QXJs 6XcYSbvBIPQdhjK7WIhG2pANItiTxKTGgXDZklVjgj2dVT4kSkCgdGYAAMssT1hn n1/jkBJu5uuCq43Wv5Ia+wEhdN0M+xgKc9x7MF/ikO2qr6x24odMNTW8VgyLsYie p9M68U23aStip2rxV1hrhZzbnjLz66V6O9fIEHmm5CYLfcGXkcrclzLIeptepSj1 bj/7dWIdXy8VnoSNx4RbckHSkMbdIkmyPKzmoYFN7p3FvmrSXsOmB6nfD0hEkbY= =S5ff -----END PGP SIGNATURE----- From shelly.mh at gmail.com Tue Aug 2 06:19:33 2011 From: shelly.mh at gmail.com (Shelly M) Date: Tue, 2 Aug 2011 13:19:33 +0300 Subject: [Bioperl-l] question regarding Bio::DB::CUTG Message-ID: Hello, My name is Shelly and I'm a student at the Hebrew university of Jerusalem. I'm trying to use the package Bio::DB::CUTG but I have some trouble retrieving the right table for a given organism. For example, if I write my $cdtable = Bio::DB::CUTG->new(-sp =>'Mus musculus'); I get a warning message :MSG: too many species - not a unique species id, and it returns _species => mitochondrion Mus musculus. So my question is what is the exact format for retrieving the specific organism? Thanks a lot for the help, Shelly From maximilien1er at gmail.com Tue Aug 2 22:50:44 2011 From: maximilien1er at gmail.com (=?ISO-8859-1?Q?Maxime_D=E9raspe?=) Date: Tue, 2 Aug 2011 19:50:44 -0700 (PDT) Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation Message-ID: Hi, when I parse a genbank file no matter what I do, the / translation="MKAV.."
tag value of a CDS never appear in the last place as it should be. Other tags like /note= /product comes after / translation which it's not the usual practice with genbank file. Could anyone have an idea how to deal with it... put /translation tag value in the last place when I write the genbank file. Thank you ! Max From shachigahoimbi at gmail.com Wed Aug 3 02:00:44 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Wed, 3 Aug 2011 11:30:44 +0530 Subject: [Bioperl-l] How to show branch length value in tree Message-ID: Dear All I am using Bio::Tree modules for constructing and drawing tree. *I am unable to show branch length value in tree. * Please tell me How can I do this, if anybody knows. Here is my script which i am using...and i also attached generated tree. Thanks in advance
################################################################################################

use Bio::AlignIO;
use Bio::Align::ProteinStatistics;
use Bio::Tree::DistanceFactory;
use Bio::TreeIO;
use Bio::Tree::Draw::Cladogram;

# for a dna alignment
# can also use ProteinStatistics

my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw');

my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');

my $stats = Bio::Align::ProteinStatistics->new;

my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd');

while( my $aln = $alnio->next_aln )
{
    my $mat = $stats->distance(-method => 'Kimura', -align => $aln);

    my $tree = $dfactory->make_tree($mat);
    $treeout->write_tree($tree);
}

my $dir = shift || '.';

opendir(DIR, $dir) || die $!;
for my $file ( readdir(DIR) )
{
    next unless $file =~ /(\S+)\.dnd$/;
    my $stem = $1;
    my $treeio = Bio::TreeIO->new('-format' => 'newick',
                                  '-file' => "$dir/$file");

    if( my $t1 = $treeio->next_tree )
    {
        my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
                                                   -tree => $t1,
                                                   -compact => 0);
        $obj1->print(-file => "$dir/$stem.eps");
    }
}
######################################################################################################## -- Regards, Shachi -------------- next part -------------- A non-text attachment was scrubbed... Name: ADP1.dnd Type: application/octet-stream Size: 1369 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ADP1.eps Type: application/postscript Size: 17718 bytes Desc: not available URL: From cjfields at illinois.edu Wed Aug 3 09:10:20 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 08:10:20 -0500 Subject: [Bioperl-l] Question to Bio::SearchIO::infernal.pm In-Reply-To: <4E32E14B020000EE00004F57@gwia1.boku.ac.at> References: <4E32E14B020000EE00004F57@gwia1.boku.ac.at> Message-ID: Nadine, Hard to guess w/o seeing the report, but I'm not terribly surprised. I believe I only coded for simple 1 CM reports, IIRC. You'll have to file this as a bug on redmine along with an example. chris On Jul 29, 2011, at 9:35 AM, Nadine Elpida Tatto wrote: > Hi There! > > > > I was wondering if you would or can help me. > > > I have an infernal report containing about 2000 CMs from an infernal run against Rfam.cm. To parse this report I wanted to use Bio::SearchIO::infernal.pm. Unfortunately this turned out to be a problem for me, because "$parser->next_result" only delivers the result for the first CM in the report and nothing more. > > > My code: > #!/usr/bin/perl -w > > > use strict;use Data::Dumper; > use Bio::SearchIO; > > > my $infile = $ARGV[0]; # infernal report > my $parser = Bio::SearchIO->new(-format => 'Infernal', > -file => $infile); > > > while( my $result = $parser->next_result ) { > print $result->query_name . "\n"; > } > > > exit; > > > > > The output: > > > ntatto:~$ ./infernalParser.pl infernal.output > 5S_rRNA > ntatto:~$ > > > > > I would expect the following (like parsing a blast report): > > > ntatto:~$ ./infernalParser.pl infernal.output > 5S_rRNA > 5_8S_rRNA > U1 > ... 
> ntatto:~$ > > > > I would be glad for help. > > > Thank you in advance. > > > Best Regards > > > N Tatto > > > > > > > > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From p.j.a.cock at googlemail.com Wed Aug 3 09:46:06 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 14:46:06 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: Message-ID: 2011/8/3 Maxime D?raspe : > Hi, > > when I parse a genbank file no matter what I do, the / > translation="MKAV.." tag value of a CDS never appear in the last place > as it should be. Other tags like /note= /product comes after / > translation which it's not the usual practice with genbank file. Could > anyone have an idea how to deal with it... put /translation tag value > in the last place when I write the genbank file. > > Thank you ! > > Max Hi Max, I'm not aware of anything in the feature table specification about the order of the feature qualifiers (the "tags" like /note and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/ I suspect BioPerl is using a hash (Biopython uses a dictionary) for the feature qualifiers, which would discard the order. Why do you care about the order? Peter From roy.chaudhuri at gmail.com Wed Aug 3 09:58:22 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 03 Aug 2011 14:58:22 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: References: Message-ID: <4E3953FE.5080304@gmail.com> Hi Shachi, I don't think you can draw labels on branches using Bio::Tree::Draw::Cladogram. 
However, it will draw node labels, so you could copy the branch lengths over to the node ids: my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, -tree => $t1, -compact => 0); for my $node ($tree->get_nodes) { $node->id($node->branch_length) if defined $node->branch_length; } $obj1->print(-file => "$dir/$stem.eps") Incidentally, in your script you write the tree out to a file, then read it back in using TreeIO. This is unnecessary, you can use $tree directly as input to Bio::Tree::Draw::Cladogram. Alternatively, you could write out a newick file and use non-Bioperl software such as njplot or MEGA to draw your tree with labelled branch lengths. Cheers, Roy. On 03/08/2011 07:00, Shachi Gahoi wrote: > Dear All > > I am using Bio::Tree modules for constructing and drawing tree. *I am unable > to show branch length value in tree. > * > Please tell me How can I do this, if anybody knows. > > Here is my script which i am using...and i also attached generated tree. > > Thanks in advance > > ################################################################################################ > > use Bio::AlignIO; > use Bio::Align::ProteinStatistics; > use Bio::Tree::DistanceFactory; > use Bio::TreeIO; > use Bio::Tree::Draw::Cladogram; > > # for a dna alignment > # can also use ProteinStatistics > > my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw'); > > my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA'); > > my $stats = Bio::Align::ProteinStatistics->new; > > my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd'); > > while( my $aln = $alnio->next_aln ) > { > my $mat = $stats->distance(-method => 'Kimura', -align => $aln); > > my $tree = $dfactory->make_tree($mat); > $treeout->write_tree($tree); > } > > my $dir = shift || '.'; > > opendir(DIR, $dir) || die $!; > for my $file ( readdir(DIR) ) > { > next unless $file =~ /(\S+)\.dnd$/; > my $stem = $1; > my $treeio = Bio::TreeIO->new('-format' => 'newick', > '-file' => 
"$dir/$file"); > > if( my $t1 = $treeio->next_tree ) > { > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > $obj1->print(-file => "$dir/$stem.eps"); > } > } > > ######################################################################################################## > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From roy.chaudhuri at gmail.com Wed Aug 3 10:01:18 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 03 Aug 2011 15:01:18 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: <4E3953FE.5080304@gmail.com> References: <4E3953FE.5080304@gmail.com> Message-ID: <4E3954AE.2080401@gmail.com> Sorry, the code had a typo, it should be: my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, -tree => $t1, -compact => 0); for my $node ($t1->get_nodes) { $node->id($node->branch_length) if defined $node->branch_length; } $obj1->print(-file => "$dir/$stem.eps") On 03/08/2011 14:58, Roy Chaudhuri wrote: > Hi Shachi, > > I don't think you can draw labels on branches using > Bio::Tree::Draw::Cladogram. However, it will draw node labels, so you > could copy the branch lengths over to the node ids: > > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > for my $node ($tree->get_nodes) { > $node->id($node->branch_length) if defined $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > Incidentally, in your script you write the tree out to a file, then read > it back in using TreeIO. This is unnecessary, you can use $tree directly > as input to Bio::Tree::Draw::Cladogram. > > Alternatively, you could write out a newick file and use non-Bioperl > software such as njplot or MEGA to draw your tree with labelled branch > lengths. > > Cheers, > Roy. 
> > On 03/08/2011 07:00, Shachi Gahoi wrote: >> Dear All >> >> I am using Bio::Tree modules for constructing and drawing tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also attached generated tree. >> >> Thanks in advance >> >> ################################################################################################ >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw'); >> >> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics->new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', -align => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> ######################################################################################################## >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Wed Aug 3 10:08:33 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 
2011 09:08:33 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: Message-ID: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu> On Aug 3, 2011, at 8:46 AM, Peter Cock wrote: > 2011/8/3 Maxime Déraspe : >> Hi, >> >> when I parse a genbank file no matter what I do, the / >> translation="MKAV.." tag value of a CDS never appear in the last place >> as it should be. Other tags like /note= /product comes after / >> translation which it's not the usual practice with genbank file. Could >> anyone have an idea how to deal with it... put /translation tag value >> in the last place when I write the genbank file. >> >> Thank you ! >> >> Max > > Hi Max, > > I'm not aware of anything in the feature table specification > about the order of the feature qualifiers (the "tags" like /note > and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/ > > I suspect BioPerl is using a hash (Biopython uses a dictionary) > for the feature qualifiers, which would discard the order. > > Why do you care about the order? > > Peter Yes, it uses a hash based on the feature tags. Not sure how Biopython handles it but my guess is something similar (Peter?). The output order was never a chief concern of ours. To tell the truth our main focus has never been simple conversion, except to transform data into a format that is more manageable/normalized. For those interested in making this change, all the code for printing features is in one method in Bio::SeqIO::genbank, _print_GenBank_FTHelper(). The best way to handle this would be to allow an optional coderef/callback that takes the feature (or the tags) and allows custom sorting and printing; I don't want to get into messy semantics on how to specifically sort tags, best to let the user decide.
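To make the callback idea above concrete, here is a minimal sketch in plain Perl. It is not BioPerl code and does not touch Bio::SeqIO::genbank: the %tags hash, the $tag_sorter coderef and the format_qualifiers() helper are hypothetical names used only for illustration, assuming qualifiers are held as a hash of tag name to list of values.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one feature's qualifiers: tag => list of values.
my %tags = (
    translation => ['MKAV'],
    product     => ['hypothetical protein'],
    note        => ['first note', 'second note'],
);

# User-supplied callback: alphabetical order, but 'translation' always last.
my $tag_sorter = sub {
    my @names = @_;
    return sort {
        ($a eq 'translation') <=> ($b eq 'translation') || $a cmp $b
    } @names;
};

# Hypothetical writer: uses the callback's order if one is given,
# otherwise falls back to (arbitrary) hash order.
sub format_qualifiers {
    my ($tags, $sorter) = @_;
    my @order = $sorter ? $sorter->(keys %$tags) : keys %$tags;
    my @lines;
    for my $tag (@order) {
        push @lines, qq{/$tag="$_"} for @{ $tags->{$tag} };
    }
    return @lines;
}

print "$_\n" for format_qualifiers(\%tags, $tag_sorter);
```

With this callback the /translation line is written after /note and /product; leaving the callback out keeps whatever order the hash happens to give, so existing behaviour would not change for users who do not care.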
chris From cjfields at illinois.edu Wed Aug 3 10:16:37 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 09:16:37 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4E390CE8.2050100@biotech.uni-tuebingen.de> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi folks, > > as I mentioned on https://redmine.open-bio.org/issues/3264 there is > something odd going on with Bio::Root::IO's _readline/_pushback > functions. This seems to be intentional, at least there is a test case > asserting the behaviour I'm seeing. It his however very confusing to the > unexpecting programmer using the code. > > One assumption I'd immediately make would be that if I have code that > does a $foo = $io->_readline; $io->_pushback($foo); $bar = > $io->_readline;, $foo will be the same string as $bar, regardless what > other pieces of the code did. Currently, this is not the case, because > the readbuffer that _pushback pushes back into has new strings appended > to the end but readline removes them from the front. I think this test is performed in the regressions already, but if not then it is more than welcome. > This easily violates the "principle of least surprise", so I think we > should change the readbuffer to a stack. As far as I can tell, changing > the _pushback function to "unshift" instead of "push" to the readbuffer > breaks only the Root/RootIO.t test designed to test the old behaviour. I > don't see any other tests failing on my system that don't fail without > this patch. > > Any comments from the core devs? I don't have a problem with that beyond the change to the RootIO.t tests (it implies a specific behavior that some developers expect, so is a very subtle API change). However, this is how one would expect it, to be more like an 'unread' stack instead of a queue. 
In fact, there is a module I used for Biome's pushback/readline called IO::Unread that implements an IO layer for mimicking this behavior, might be worth looking into. > Cheers, > Kai chris Christopher Fields Senior Research Scientist National Center for Supercomputing Applications Institute for Genomic Biology University of Illinois Urbana-Champaign 1206 W. Gregory Dr., MC-195 Urbana, IL 61801 From p.j.a.cock at googlemail.com Wed Aug 3 10:45:21 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 15:45:21 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu> References: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu> Message-ID: On Wed, Aug 3, 2011 at 3:08 PM, Chris Fields wrote: > > Yes, it uses a hash based on the feature tags. Not sure how Biopython > handles it but my guess is something similar (Peter?). Yes, we key on the feature qualifier (e.g. note or product) and the values are a list of qualifier values (e.g. you can have two notes). > The output order was never a chief concern of ours. To tell the truth > our main focus has never been simple conversion, except to transform > data into a format that is more manageable/normalized. > > For those interested in making this change, all the code for printing > features is in one method in Bio::SeqIO::genbank, _print_GenBank_FTHelper(). > The best way to handle this would be to allow an optional coderef/callback > that takes the feature (or the tags) and allows custom sorting and printing; > I don't want to get into messy semantics on how to specifically sort tags, > best to let the user decide. For Biopython switching from the default dictionary (hash type) to an order-preserving dictionary would be one option. I too have no wish to try and implement qualifier sorting without an explicit standard.
Peter From maximilien1er at gmail.com Wed Aug 3 10:48:05 2011 From: maximilien1er at gmail.com (Maxime Déraspe) Date: Wed, 3 Aug 2011 07:48:05 -0700 (PDT) Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: Message-ID: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> > Hi Max, > > I'm not aware of anything in the feature table specification > about the order of the feature qualifiers (the "tags" like /note > and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/ > > I suspect BioPerl is using a hash (Biopython uses a dictionary) > for the feature qualifiers, which would discard the order. > > Why do you care about the order? > > Peter > Hi Peter, I care about the order for the submission to ncbi. But I guess they will reformat the file before getting it in their database. It's also visually better when the translation of the protein comes in the end of the annotation for the CDS and not before /product, /note .... Anyway maybe I'll reformat the file in sequin table for a direct submission to ncbi with sequin. Thank you. Max From p.j.a.cock at googlemail.com Wed Aug 3 12:00:01 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 17:00:01 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: 2011/8/3 Maxime Déraspe : >> >> Why do you care about the order? >> > > Hi Peter, > > I care about the order for the submission to ncbi. Do the NCBI have some guidelines which ask for a particular order? > But I guess they > will reformat the file before getting it in their database. They seem to generate the official GenBank files from their database - so I doubt the input order matters.
> It's also > visually better when the translation of the protein comes in the end > of the annotation for the CDS and not before /product, /note .... I do see your point, but if that were the only motivation I wouldn't want to make generating GenBank output any more complicated than it already is. > Anyway maybe I'll reformat the file in sequin table for a direct > submission to ncbi with sequin. > > Thank you. > > Max Peter From cjfields at illinois.edu Wed Aug 3 12:52:02 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 11:52:02 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: > 2011/8/3 Maxime Déraspe : >>> >>> Why do you care about the order? >>> >> >> Hi Peter, >> >> I care about the order for the submission to ncbi. > > Do the NCBI have some guidelines which ask for a particular order? No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. >> But I guess they >> will reformat the file before getting it in their database. > > They seem to generate the official GenBank files from their > database - so I doubt the input order matters. Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). >> It's also >> visually better when the translation of the protein comes in the end >> of the annotation for the CDS and not before /product, /note .... > > I do see your point, but if that were the only motivation I wouldn't > want to make generating GenBank output any more complicated > than it already is. ... >> Anyway maybe I'll reformat the file in sequin table for a direct >> submission to ncbi with sequin.
>> >> Thank you. >> >> Max > > Peter Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): https://redmine.open-bio.org/ chris From cjfields at illinois.edu Wed Aug 3 13:10:31 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 12:10:31 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Message-ID: <51452A39-42B7-4BBF-9F50-A37419E75454@illinois.edu> IMHO I find genbank too unwieldy, but it's nice to know the output works for NCBI submission. chris On Aug 3, 2011, at 12:06 PM, Brian Osborne wrote: > Peter, > > I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. 
In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. > > Brian O > > On Aug 3, 2011, at 12:52 PM, Chris Fields wrote: > >> On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: >> >>> 2011/8/3 Maxime Déraspe : >>>>> >>>>> Why do you care about the order? >>>>> >>>> >>>> Hi Peter, >>>> >>>> I care about the order for the submission to ncbi. >>> >>> Do the NCBI have some guidelines which ask for a particular order? >> >> No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. >> >>>> But I guess they >>>> will reformat the file before getting it in their database. >>> >>> They seem to generate the official GenBank files from their >>> database - so I doubt the input order matters. >> >> Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). >> >>>> It's also >>>> visually better when the translation of the protein comes in the end >>>> of the annotation for the CDS and not before /product, /note .... >>> >>> I do see your point, but if that were the only motivation I wouldn't >>> want to make generating GenBank output any more complicated >>> than it already is. >> ... >>>> Anyway maybe I'll reformat the file in sequin table for a direct >>>> submission to ncbi with sequin. >>>> >>>> Thank you. >>>> >>>> Max >>> >>> Peter >> >> >> Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format.
Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. >> >> We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. >> >> Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): >> >> https://redmine.open-bio.org/ >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bosborne11 at verizon.net Wed Aug 3 13:06:05 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 03 Aug 2011 13:06:05 -0400 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Peter, I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. Brian O On Aug 3, 2011, at 12:52 PM, Chris Fields wrote: > On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: > >> 2011/8/3 Maxime Déraspe : >>>> >>>> Why do you care about the order? >>>> >>> >>> Hi Peter, >>> >>> I care about the order for the submission to ncbi. >> >> Do the NCBI have some guidelines which ask for a particular order?
> > No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. > >>> But I guess they >>> will reformat the file before getting it in their database. >> >> They seem to generate the official GenBank files from their >> database - so I doubt the input order matters. > > Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). > >>> It's also >>> visually better when the translation of the protein comes in the end >>> of the annotation for the CDS and not before /product, /note .... >> >> I do see your point, but if that were the only motivation I wouldn't >> want to make generating GenBank output any more complicated >> than it already is. > ... >>> Anyway maybe I'll reformat the file in sequin table for a direct >>> submission to ncbi with sequin. >>> >>> Thank you. >>> >>> Max >> >> Peter > > > Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. > > We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. 
> > Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): > > https://redmine.open-bio.org/ > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From lskatz at gmail.com Wed Aug 3 17:01:24 2011 From: lskatz at gmail.com (Lee Katz) Date: Wed, 3 Aug 2011 17:01:24 -0400 Subject: [Bioperl-l] SeqIO: paired end reads Message-ID: Hi all! I was wondering how to construct paired end reads from scratch. I know the locations of certain sequences across the genome with a high degree of confidence and so I want to give them to my assembler as paired end reads, along with my other sequence runs (454 and Illumina runs). I plan to use Newbler. My only problem is that I do not know the correct format in order to specify distance and sequences for a paired end reads run, and so I hope that there is a SeqIO solution. At the least, I hope that one bioperl member can point me to where the definition of the paired end reads file format is...? Thank you! --Lee From jason.stajich at gmail.com Wed Aug 3 17:17:01 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Wed, 3 Aug 2011 13:17:01 -0800 Subject: [Bioperl-l] SeqIO: paired end reads In-Reply-To: References: Message-ID: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> it depends on the assembler - For Illumina usually the paired ends end with /1 /2 and they have the same ID but are in two different files. Depends on if you are using interleaved paired reads or in two separate files. 
some just expect the paired reads to be mated by virtue of being in same order in two files. the ABYSS and Velvet manuals both explain what is expected so you will want to check on what are Newbler's assumptions on how the paired ends are encoded. There are simulator tools if that is what you are trying to do in the end? checkout wgsim which comes with samtools or try dnaa On Aug 3, 2011, at 1:01 PM, Lee Katz wrote: > Hi all! I was wondering how to construct paired end reads from scratch. I > know the locations of certain sequences across the genome with a high degree > of confidence and so I want to give them to my assembler as paired end > reads, along with my other sequence runs (454 and Illumina runs). I plan to > use Newbler. > > My only problem is that I do not know the correct format in order to specify > distance and sequences for a paired end reads run, and so I hope that there > is a SeqIO solution. At the least, I hope that one bioperl member can point > me to where the definition of the paired end reads file format is...? > > Thank you! > > --Lee > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From roy.chaudhuri at gmail.com Thu Aug 4 07:22:23 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Thu, 04 Aug 2011 12:22:23 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> Message-ID: <4E3A80EF.2010409@gmail.com> Hi Shachi, Please keep replies on the mailing list, that way others can follow the discussion. As I mentioned, it is not possible to draw njplot-style trees with labelled branches using Bio::Tree::Draw::Cladogram, it currently only labels nodes (you could perhaps add branch labels as a feature request on Redmine). 
The code I gave overwrites the existing "leaf" node ids (the accessions) with branch lengths, if you want to also keep the existing labels you could try something like: for my $node ($t1->get_nodes) { if ($node->is_Leaf) { $node->id($node->branch_length.' '.$node->id); } else { $node->id($node->branch_length) } } Cheers, Roy. On 04/08/2011 05:36, Shachi Gahoi wrote: > Thank You so much. Now branch length is coming in tree. > > But I want Accession number in place of node id. > > I attached snapshot of tree as I want. Please tell me how can I do this. > > > > > On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri > wrote: > > Sorry, the code had a typo, it should be: > > > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > for my $node ($t1->get_nodes) { > > $node->id($node->branch_length) if defined $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > On 03/08/2011 14:58, Roy Chaudhuri wrote: > > Hi Shachi, > > I don't think you can draw labels on branches using > Bio::Tree::Draw::Cladogram. However, it will draw node labels, > so you > could copy the branch lengths over to the node ids: > > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > for my $node ($tree->get_nodes) { > $node->id($node->branch_length) if defined > $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > Incidentally, in your script you write the tree out to a file, > then read > it back in using TreeIO. This is unnecessary, you can use $tree > directly > as input to Bio::Tree::Draw::Cladogram. > > Alternatively, you could write out a newick file and use non-Bioperl > software such as njplot or MEGA to draw your tree with labelled > branch > lengths. > > Cheers, > Roy. > > On 03/08/2011 07:00, Shachi Gahoi wrote: > > Dear All > > I am using Bio::Tree modules for constructing and drawing > tree. *I am unable > to show branch length value in tree.
> * > Please tell me How can I do this, if anybody knows. > > Here is my script which i am using...and i also attached > generated tree. > > Thanks in advance > > ##############################__##############################__##############################__###### > > use Bio::AlignIO; > use Bio::Align::ProteinStatistics; > use Bio::Tree::DistanceFactory; > use Bio::TreeIO; > use Bio::Tree::Draw::Cladogram; > > # for a dna alignment > # can also use ProteinStatistics > > my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', > -format=>'clustalw'); > > my $dfactory = Bio::Tree::DistanceFactory->__new(-method => > 'UPGMA'); > > my $stats = Bio::Align::ProteinStatistics-__>new; > > my $treeout = Bio::TreeIO->new(-format => 'newick', -file > =>'>ADP1.dnd'); > > while( my $aln = $alnio->next_aln ) > { > my $mat = $stats->distance(-method => 'Kimura', -align > => $aln); > > my $tree = $dfactory->make_tree($mat); > $treeout->write_tree($tree); > } > > my $dir = shift || '.'; > > opendir(DIR, $dir) || die $!; > for my $file ( readdir(DIR) ) > { > next unless $file =~ /(\S+)\.dnd$/; > my $stem = $1; > my $treeio = Bio::TreeIO->new('-format' => 'newick', > '-file' => "$dir/$file"); > > if( my $t1 = $treeio->next_tree ) > { > my $obj1 = > Bio::Tree::Draw::Cladogram->__new(-bootstrap => 1, > -tree > => $t1, > > -compact => 0); > $obj1->print(-file => "$dir/$stem.eps"); > } > } > > ##############################__##############################__##############################__############## > > > > > _________________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/__mailman/listinfo/bioperl-l > > > > > > > > -- > Regards, > Shachi From razi.khaja at gmail.com Thu Aug 4 13:39:28 2011 From: razi.khaja at gmail.com (Razi Khaja) Date: Thu, 4 Aug 2011 13:39:28 -0400 Subject: [Bioperl-l] BioPerl on GitHub will not install Message-ID: All, I just checked out the latest development version of BioPerl from GitHub and 
found that it does not install because bp_das_server.pl is missing. Building BioPerl 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm line 218. Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm line 218. Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. After copying the bp_das_server.pl that I had from a previous installation to 'blib/script', I was able to ./Build test and ./Build install the development version I checked out. Could someone test out this problem and fix it on github? if it really is a problem? Thanks, Razi From cjfields at illinois.edu Thu Aug 4 13:42:48 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 12:42:48 -0500 Subject: [Bioperl-l] [Bioperl-guts-l] BioPerl on GitHub will not install In-Reply-To: References: Message-ID: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> Yes, I can replicate that. It's from the recent renaming for scripts. I'll look into it. chris On Aug 4, 2011, at 12:39 PM, Razi Khaja wrote: > All, > > I just checked out the latest development version of BioPerl from GitHub and > found that it does not install because bp_das_server.pl is missing. > > Building BioPerl > 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are > identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 > Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm > line 218. > Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm > line 218. > Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': > No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. 
> > After copying the bp_das_server.pl that I had from a previous installation > to 'blib/script', I was able to ./Build test and ./Build install the > development version I checked out. > > Could someone test out this problem and fix it on github? if it really is a > problem? > > Thanks, > > Razi > _______________________________________________ > Bioperl-guts-l mailing list > Bioperl-guts-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l From hlapp at drycafe.net Thu Aug 4 17:31:52 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Thu, 4 Aug 2011 17:31:52 -0400 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: I agree. In fact I'm surprised that $io->_pushback() does not act like unshift() - that's I thought how it is used. -hilmar On Aug 3, 2011, at 10:16 AM, Chris Fields wrote: > On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: > >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Hi folks, >> >> as I mentioned on https://redmine.open-bio.org/issues/3264 there is >> something odd going on with Bio::Root::IO's _readline/_pushback >> functions. This seems to be intentional, at least there is a test >> case >> asserting the behaviour I'm seeing. It his however very confusing >> to the >> unexpecting programmer using the code. >> >> One assumption I'd immediately make would be that if I have code that >> does a $foo = $io->_readline; $io->_pushback($foo); $bar = >> $io->_readline;, $foo will be the same string as $bar, regardless >> what >> other pieces of the code did. Currently, this is not the case, >> because >> the readbuffer that _pushback pushes back into has new strings >> appended >> to the end but readline removes them from the front. > > I think this test is performed in the regressions already, but if > not then it is more than welcome. 
> >> This easily violates the "principle of least surprise", so I think we >> should change the readbuffer to a stack. As far as I can tell, >> changing >> the _pushback function to "unshift" instead of "push" to the >> readbuffer >> breaks only the Root/RootIO.t test designed to test the old >> behaviour. I >> don't see any other tests failing on my system that don't fail >> without >> this patch. >> >> Any comments from the core devs? > > I don't have a problem with that beyond the change to the RootIO.t > tests (it implies a specific behavior that some developers expect, > so is a very subtle API change). However, this is how one would > expect it, to be more like an 'unread' stack instead of a queue. In > fact, there is a module I used for Biome's pushback/readline called > IO::Unread that implements an IO layer for mimicing this behavior, > might be worth looking into. > >> Cheers, >> Kai > > chris > > > Christopher Fields > Senior Research Scientist > National Center for Supercomputing Applications > Institute for Genomic Biology > University of Illinois Urbana-Champaign > 1206 W. Gregory Dr. , MC-195 > Urbana, IL 61801 > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Thu Aug 4 17:42:30 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 16:42:30 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> Yeah, it's a queue; the 'buffering' is a simple internal array using push/shift. I say we merge the change in from the branch and fix any modules accordingly. 
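[Editorial sketch] The queue-versus-stack difference discussed in this thread can be reproduced with a small stand-in reader. This is a simplified model, not the Bio::Root::IO code: `make_reader` and `roundtrip_same` are hypothetical names, the buffer is pre-seeded to mimic "other pieces of the code" having pushed lines back earlier, and the two modes correspond to the push (current) and unshift (proposed) variants of _pushback.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a buffered reader: readline drains the
# pushback buffer from the front before touching the underlying source.
sub make_reader {
    my ($mode, $buffered, $source) = @_;
    my @buffer = @$buffered;    # lines already pushed back by other code
    my @source = @$source;      # lines still unread in the "file"
    return {
        readline => sub { @buffer ? shift @buffer : shift @source },
        pushback => sub {
            my ($line) = @_;
            if ($mode eq 'queue') { push @buffer, $line }       # append at the end
            else                  { unshift @buffer, $line }    # return to the front
        },
    };
}

# Reads a line, pushes it back, reads again; true if both reads agree.
sub roundtrip_same {
    my ($mode) = @_;
    my $io  = make_reader($mode, ["h1\n", "h2\n"], ["body\n"]);
    my $foo = $io->{readline}->();
    $io->{pushback}->($foo);
    my $bar = $io->{readline}->();
    return $foo eq $bar;
}

for my $mode (qw(queue stack)) {
    printf "%s: %s\n", $mode, roundtrip_same($mode) ? 'same line' : 'different line';
}
# prints:
# queue: different line
# stack: same line
```

With an empty buffer both variants behave the same, which may be why the difference only shows up when several lines have been pushed back.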
chris On Aug 4, 2011, at 4:31 PM, Hilmar Lapp wrote: > I agree. In fact I'm surprised that $io->_pushback() does not act like unshift() - that's I thought how it is used. > > -hilmar > > On Aug 3, 2011, at 10:16 AM, Chris Fields wrote: > >> On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: >> >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> Hi folks, >>> >>> as I mentioned on https://redmine.open-bio.org/issues/3264 there is >>> something odd going on with Bio::Root::IO's _readline/_pushback >>> functions. This seems to be intentional, at least there is a test case >>> asserting the behaviour I'm seeing. It his however very confusing to the >>> unexpecting programmer using the code. >>> >>> One assumption I'd immediately make would be that if I have code that >>> does a $foo = $io->_readline; $io->_pushback($foo); $bar = >>> $io->_readline;, $foo will be the same string as $bar, regardless what >>> other pieces of the code did. Currently, this is not the case, because >>> the readbuffer that _pushback pushes back into has new strings appended >>> to the end but readline removes them from the front. >> >> I think this test is performed in the regressions already, but if not then it is more than welcome. >> >>> This easily violates the "principle of least surprise", so I think we >>> should change the readbuffer to a stack. As far as I can tell, changing >>> the _pushback function to "unshift" instead of "push" to the readbuffer >>> breaks only the Root/RootIO.t test designed to test the old behaviour. I >>> don't see any other tests failing on my system that don't fail without >>> this patch. >>> >>> Any comments from the core devs? >> >> I don't have a problem with that beyond the change to the RootIO.t tests (it implies a specific behavior that some developers expect, so is a very subtle API change). However, this is how one would expect it, to be more like an 'unread' stack instead of a queue. 
In fact, there is a module I used for Biome's pushback/readline called IO::Unread that implements an IO layer for mimicing this behavior, might be worth looking into. >> >>> Cheers, >>> Kai >> >> chris >> >> >> Christopher Fields >> Senior Research Scientist >> National Center for Supercomputing Applications >> Institute for Genomic Biology >> University of Illinois Urbana-Champaign >> 1206 W. Gregory Dr. , MC-195 >> Urbana, IL 61801 >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 4 18:11:29 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 17:11:29 -0500 Subject: [Bioperl-l] [Bioperl-guts-l] BioPerl on GitHub will not install In-Reply-To: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> References: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> Message-ID: <0A691C42-539E-45A1-B44F-7B0B5D8DE3D8@illinois.edu> Now fixed on github. There was some cruft left in Bio::Root::Build that didn't deal with the recent script renaming. chris On Aug 4, 2011, at 12:42 PM, Chris Fields wrote: > Yes, I can replicate that. It's from the recent renaming for scripts. I'll look into it. > > chris > > On Aug 4, 2011, at 12:39 PM, Razi Khaja wrote: > >> All, >> >> I just checked out the latest development version of BioPerl from GitHub and >> found that it does not install because bp_das_server.pl is missing. 
>> >> Building BioPerl >> 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are >> identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 >> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm >> line 218. >> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm >> line 218. >> Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': >> No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. >> >> After copying the bp_das_server.pl that I had from a previous installation >> to 'blib/script', I was able to ./Build test and ./Build install the >> development version I checked out. >> >> Could someone test out this problem and fix it on github? if it really is a >> problem? >> >> Thanks, >> >> Razi >> _______________________________________________ >> Bioperl-guts-l mailing list >> Bioperl-guts-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From shachigahoimbi at gmail.com Fri Aug 5 01:40:11 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Fri, 5 Aug 2011 11:10:11 +0530 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: <4E3A80EF.2010409@gmail.com> References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> Message-ID: Instead of both node id and accession, Can I replace node id with accession? On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote: > Hi Shachi, > > Please keep replies on the mailing list, that way others can follow the > discussion. > > As I mentioned, it is not possible to draw njplot-style trees with labelled > branches using Bio::Tree::Draw::Cladogram, it currently only labels nodes > (you could perhaps add branch labels as a feature request on Redmine). 
> > The code I gave overwrites the existing "leaf" node ids (the accessions) > with branch lengths, if you want to also keep the existing labels you could > try something like: > > > for my $node ($t1->get_nodes) { > if ($node->is_Leaf) { > $node->id($node->branch_**length.' '.$node->id); > } else { > > $node->id($node->branch_**length) > } > } > > Cheers, > Roy. > > > On 04/08/2011 05:36, Shachi Gahoi wrote: > >> Thank You so much. Now branch length is coming in tree. >> >> But I want Accesssion number in place of node id. >> >> I attached snapshot of tree as I want. Please tell me how can I do this. >> >> >> >> >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri > >> wrote: >> >> Sorry, the code had a typo, it should be: >> >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($t1->get_nodes) { >> >> $node->id($node->branch___**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> On 03/08/2011 14:58, Roy Chaudhuri wrote: >> >> Hi Shachi, >> >> I don't think you can draw labels on branches using >> Bio::Tree::Draw::Cladogram. However, it will draw node labels, >> so you >> could copy the branch lengths over to the node ids: >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($tree->get_nodes) { >> $node->id($node->branch___**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> Incidentally, in your script you write the tree out to a file, >> then read >> it back in using TreeIO. This is unnecessary, you can use $tree >> directly >> as input to Bio::Tree::Draw::Cladogram. >> >> Alternatively, you could write out a newick file and use >> non-Bioperl >> software such as njplot or MEGA to draw your tree with labelled >> branch >> lengths. >> >> Cheers, >> Roy. 
>> >> -- >> Regards, >> Shachi >> > > -- Regards, Shachi From kai.blin at biotech.uni-tuebingen.de Fri Aug 5 04:40:57 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Fri, 05 Aug 2011 10:40:57 +0200 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> Message-ID: <4E3BAC99.8050806@biotech.uni-tuebingen.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2011-08-04 23:42, Chris Fields wrote: > Yeah, it's a queue; the 'buffering' is a simple internal array using > push/shift. I say we merge the change in from the branch and fix > any modules accordingly. Ok, I'm happy to take care of it, if people can tell me how to find and fix modules that use the old assumption. My initial attempt right after making the change was to run the test suite, which came up clean apart from the RootIO.t case that my patch now modifies as well. Cheers, Kai - -- Dipl.-Inform. 
Kai Blin         kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-Universität Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax :   ++49 7071 29-5979
Germany
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOO6yZAAoJEKM5lwBiwTTPdjsH/0ELbz9VYIzxlpx+QZ3Jvd55
KTXVP+oOzjIDlOdxbdqYR0w04VXnpkQek3hVt0mbreuKvtdMJY/YhRwZLiOzYSak
ruhswUJQnm3K2vkaqpgLESIIUASneFrW7ezfV3R9q/Ov730GBDAtkLTEk7cVV5Cg
W515ixJtNC7v6fZmNFJZudQbcUYYgy+8BFgvNUaSoH8YqubMXzjFXknBWeWT0qco
ivHjqIc6Nkap799ijPiLEU7ArI1pEOB2jyvjntIocFR72imbo7e86RaVHJCNl/N7
GFbRGoH2m7LVeWFYuNM3vsTS3W4KVLg9U/8UBysykR3uoHAVJhm4T5nCT4NKE/w=
=z6QZ
-----END PGP SIGNATURE-----

From roy.chaudhuri at gmail.com Fri Aug 5 06:54:32 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 05 Aug 2011 11:54:32 +0100
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To:
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com>
Message-ID: <4E3BCBE8.4030303@gmail.com>

In that case then you only want to add branch lengths to non-leaf nodes, so it would be:

    for my $node ($t1->get_nodes) {
        $node->id($node->branch_length) unless $node->is_Leaf
    }

On 05/08/2011 06:40, Shachi Gahoi wrote:
>
> Instead of both node id and accession, Can I replace node id with accession?
>
>
> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote:
>
>     Hi Shachi,
>
>     Please keep replies on the mailing list, that way others can follow
>     the discussion.
>
>     As I mentioned, it is not possible to draw njplot-style trees with
>     labelled branches using Bio::Tree::Draw::Cladogram, it currently
>     only labels nodes (you could perhaps add branch labels as a feature
>     request on Redmine).
> > The code I gave overwrites the existing "leaf" node ids (the > accessions) with branch lengths, if you want to also keep the > existing labels you could try something like: > > > for my $node ($t1->get_nodes) { > if ($node->is_Leaf) { > $node->id($node->branch___length.' '.$node->id); > } else { > > $node->id($node->branch___length) > } > } > > Cheers, > Roy. > > > On 04/08/2011 05:36, Shachi Gahoi wrote: > > Thank You so much. Now branch length is coming in tree. > > But I want Accesssion number in place of node id. > > I attached snapshot of tree as I want. Please tell me how can I > do this. > > > > > On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri > > >> wrote: > > Sorry, the code had a typo, it should be: > > > my $obj1 = Bio::Tree::Draw::Cladogram->____new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > for my $node ($t1->get_nodes) { > > $node->id($node->branch_____length) if defined > $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > On 03/08/2011 14:58, Roy Chaudhuri wrote: > > Hi Shachi, > > I don't think you can draw labels on branches using > Bio::Tree::Draw::Cladogram. However, it will draw node > labels, > so you > could copy the branch lengths over to the node ids: > > my $obj1 = > Bio::Tree::Draw::Cladogram->____new(-bootstrap => 1, > -tree => > $t1, > -compact => > 0); > for my $node ($tree->get_nodes) { > $node->id($node->branch_____length) if defined > $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > Incidentally, in your script you write the tree out to a > file, > then read > it back in using TreeIO. This is unnecessary, you can > use $tree > directly > as input to Bio::Tree::Draw::Cladogram. > > Alternatively, you could write out a newick file and use > non-Bioperl > software such as njplot or MEGA to draw your tree with > labelled > branch > lengths. > > Cheers, > Roy. 
> > On 03/08/2011 07:00, Shachi Gahoi wrote: > > Dear All > > I am using Bio::Tree modules for constructing and > drawing > tree. *I am unable > to show branch length value in tree. > * > Please tell me How can I do this, if anybody knows. > > Here is my script which i am using...and i also attached > generated tree. > > Thanks in advance > > > ##############################____############################__##__##########################__####__###### > > use Bio::AlignIO; > use Bio::Align::ProteinStatistics; > use Bio::Tree::DistanceFactory; > use Bio::TreeIO; > use Bio::Tree::Draw::Cladogram; > > # for a dna alignment > # can also use ProteinStatistics > > my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', > -format=>'clustalw'); > > my $dfactory = > Bio::Tree::DistanceFactory->____new(-method => > 'UPGMA'); > > my $stats = Bio::Align::ProteinStatistics-____>new; > > my $treeout = Bio::TreeIO->new(-format => 'newick', > -file > =>'>ADP1.dnd'); > > while( my $aln = $alnio->next_aln ) > { > my $mat = $stats->distance(-method => 'Kimura', > -align > => $aln); > > my $tree = $dfactory->make_tree($mat); > $treeout->write_tree($tree); > } > > my $dir = shift || '.'; > > opendir(DIR, $dir) || die $!; > for my $file ( readdir(DIR) ) > { > next unless $file =~ /(\S+)\.dnd$/; > my $stem = $1; > my $treeio = Bio::TreeIO->new('-format' => > 'newick', > '-file' => "$dir/$file"); > > if( my $t1 = $treeio->next_tree ) > { > my $obj1 = > Bio::Tree::Draw::Cladogram->____new(-bootstrap => 1, > > -tree > => $t1, > > -compact => 0); > $obj1->print(-file => "$dir/$stem.eps"); > } > } > > > ##############################____############################__##__##########################__####__############## > > > > > ___________________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > > > > http://lists.open-bio.org/____mailman/listinfo/bioperl-l > > > > > > > > > > -- > Regards, > Shachi > > > > > > -- > Regards, > Shachi From lskatz at 
gmail.com Fri Aug 5 10:32:50 2011 From: lskatz at gmail.com (Lee Katz) Date: Fri, 5 Aug 2011 10:32:50 -0400 Subject: [Bioperl-l] SeqIO: paired end reads In-Reply-To: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> Message-ID: Thank you. I figured out through the Newbler manual that there is a linker sequence to separate the paired end reads. Then, the forum at http://seqanswers.com/forums/showthread.php?t=12940 showed me that the linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC". I think a useful addition to bioperl could be to have paired end reads. This is outside of the domain of bioperl, but now I am left wondering how I could specify the distance between reads in Newbler, if the linker sequence is fixed. On Wed, Aug 3, 2011 at 5:17 PM, Jason Stajich wrote: > it depends on the assembler - For Illumina usually the paired ends end with > /1 /2 and they have the same ID but are in two different files. Depends on > if you are using interleaved paired reads or in two separate files. some > just expect the paired reads to be mated by virtue of being in same order in > two files. the ABYSS and Velvet manuals both explain what is expected so > you will want to check on what are Newbler's assumptions on how the paired > ends are encoded. > > There are simulator tools if that is what you are trying to do in the end? > checkout wgsim which comes with samtools or try dnaa > > > On Aug 3, 2011, at 1:01 PM, Lee Katz wrote: > > > Hi all! I was wondering how to construct paired end reads from scratch. > I > > know the locations of certain sequences across the genome with a high > degree > > of confidence and so I want to give them to my assembler as paired end > > reads, along with my other sequence runs (454 and Illumina runs). I plan > to > > use Newbler. 
> > > > My only problem is that I do not know the correct format in order to > specify > > distance and sequences for a paired end reads run, and so I hope that > there > > is a SeqIO solution. At the least, I hope that one bioperl member can > point > > me to where the definition of the paired end reads file format is...? > > > > Thank you! > > > > --Lee > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Fri Aug 5 11:50:42 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 5 Aug 2011 10:50:42 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4E3BAC99.8050806@biotech.uni-tuebingen.de> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> <4E3BAC99.8050806@biotech.uni-tuebingen.de> Message-ID: <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> I would just go based on the test suite for now. If we run into others that don't have tests we need to add new tests for those anyway. chris On Aug 5, 2011, at 3:40 AM, Kai Blin wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 2011-08-04 23:42, Chris Fields wrote: > >> Yeah, it's a queue; the 'buffering' is a simple internal array using >> push/shift. I say we merge the change in from the branch and fix >> any modules accordingly. > > Ok, I'm happy to take care of it, if people can tell me how to find and > fix modules that use the old assumption. My initial attempt right after > making the change was to run the test suite, which came up clean apart > from the RootIO.t case that my patch now modifies as well. > > Cheers, > Kai > > - -- > Dipl.-Inform. 
Kai Blin kai.blin at biotech.uni-tuebingen.de > Institute for Microbiology and Infection Medicine > Division of Microbiology/Biotechnology > Eberhard-Karls-Universit?t T?bingen > Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 > D-72076 T?bingen Fax : ++49 7071 29-5979 > Germany > Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iQEcBAEBAgAGBQJOO6yZAAoJEKM5lwBiwTTPdjsH/0ELbz9VYIzxlpx+QZ3Jvd55 > KTXVP+oOzjIDlOdxbdqYR0w04VXnpkQek3hVt0mbreuKvtdMJY/YhRwZLiOzYSak > ruhswUJQnm3K2vkaqpgLESIIUASneFrW7ezfV3R9q/Ov730GBDAtkLTEk7cVV5Cg > W515ixJtNC7v6fZmNFJZudQbcUYYgy+8BFgvNUaSoH8YqubMXzjFXknBWeWT0qco > ivHjqIc6Nkap799ijPiLEU7ArI1pEOB2jyvjntIocFR72imbo7e86RaVHJCNl/N7 > GFbRGoH2m7LVeWFYuNM3vsTS3W4KVLg9U/8UBysykR3uoHAVJhm4T5nCT4NKE/w= > =z6QZ > -----END PGP SIGNATURE----- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 5 16:49:54 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 5 Aug 2011 15:49:54 -0500 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Message-ID: <1FDBD8D4-E8E6-44EB-A18A-7E74A0EF9014@illinois.edu> Okay, I tested this out on a branch and then merged into 'master'. Test::Most is a 'build_requires'; Bio::Root::Test is now just a wrapper for Test::Most methods, with a few extra wrinkles to deal with Test::Warn and a few additional methods. I also removed extraneous modules in t/lib along with Bio::Root::Test::Warn (that code was merged into Bio::Root::Test to keep all evilness in one contained location). The nice thing is the transition didn't require changing any tests. 
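As a rough illustration of the 'build_requires' arrangement described above, a Build.PL could declare Test::Most along these lines. This is only a sketch using plain Module::Build; the distribution name and version are hypothetical, and the real BioPerl Build.PL (which goes through Bio::Root::Build) will look different.

```perl
#!/usr/bin/perl
# Build.PL: hypothetical sketch, NOT the actual BioPerl Build.PL.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    module_name  => 'Bio::Example',   # hypothetical distribution name
    dist_version => '0.01',           # hypothetical version
    license      => 'perl',

    # Needed at run time by the installed modules.
    requires => {
        'perl' => '5.6.1',            # assumed minimum, for illustration only
    },

    # Needed only while building and running the test suite,
    # so it does not become a run-time dependency of the package.
    build_requires => {
        'Test::Most' => 0,
    },
);

# Writes the ./Build script that "./Build test" and "./Build install" use.
$build->create_build_script;
```

With something like this in place, test files would load Test::Most (directly or through a wrapper such as Bio::Root::Test) instead of relying on bundled copies under t/lib.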
However, this will require some testing across the board to make sure everything's working. Maybe worth getting the code cleaned up for another quick point release prior to the GSoC mayhem to ensue shortly... :) chris On Aug 1, 2011, at 3:34 PM, Chris Fields wrote: > Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! > > chris > > On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > >> I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. >> >> -hilmar >> >> On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: >> >>> All, >>> >>> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >>> >>> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). >>> >>> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >>> >>> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. 
Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >>> >>> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >>> >>> >>> chris >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : >> =========================================================== >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From kai.blin at biotech.uni-tuebingen.de Fri Aug 5 18:35:32 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Sat, 06 Aug 2011 00:35:32 +0200 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> <4E3BAC99.8050806@biotech.uni-tuebingen.de> <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> Message-ID: <4E3C7034.2000106@biotech.uni-tuebingen.de> On 2011-08-05 17:50, Chris Fields wrote: > I would just go based on the test suite for now. If we run into > others that don't have tests we need to add new tests for those > anyway. Ok, pushed to master. Cheers, Kai -- Dipl.-Inform. 
Kai Blin kai.blin at biotech.uni-tuebingen.de Institute for Microbiology and Infection Medicine Division of Microbiology/Biotechnology Eberhard-Karls-University of T?bingen Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 D-72076 T?bingen Fax : ++49 7071 29-5979 Deutschland Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben From shachigahoimbi at gmail.com Sat Aug 6 00:25:43 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Sat, 6 Aug 2011 09:55:43 +0530 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: <4E3BCBE8.4030303@gmail.com> References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> <4E3BCBE8.4030303@gmail.com> Message-ID: Thank you so much. Please tell me one more thing, *can I reduce branch length font? * On Fri, Aug 5, 2011 at 4:24 PM, Roy Chaudhuri wrote: > In that case then you only want to add branch lengths to non-leaf nodes, so > it would be: > > > for my $node ($t1->get_nodes) { > $node->id($node->branch_**length) unless $node->is_Leaf > > } > > > On 05/08/2011 06:40, Shachi Gahoi wrote: > >> >> Instead of both node id and accession, Can I replace node id with >> accession? >> >> >> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri > >> wrote: >> >> Hi Shachi, >> >> Please keep replies on the mailing list, that way others can follow >> the discussion. >> >> As I mentioned, it is not possible to draw njplot-style trees with >> labelled branches using Bio::Tree::Draw::Cladogram, it currently >> only labels nodes (you could perhaps add branch labels as a feature >> request on Redmine). >> >> The code I gave overwrites the existing "leaf" node ids (the >> accessions) with branch lengths, if you want to also keep the >> existing labels you could try something like: >> >> >> for my $node ($t1->get_nodes) { >> if ($node->is_Leaf) { >> $node->id($node->branch___**length.' '.$node->id); >> } else { >> >> $node->id($node->branch___**length) >> } >> } >> >> Cheers, >> Roy. 
>> >> >> On 04/08/2011 05:36, Shachi Gahoi wrote: >> >> Thank You so much. Now branch length is coming in tree. >> >> But I want Accesssion number in place of node id. >> >> I attached snapshot of tree as I want. Please tell me how can I >> do this. >> >> >> >> >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri >> >> > >> > >>> >> wrote: >> >> Sorry, the code had a typo, it should be: >> >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => >> 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($t1->get_nodes) { >> >> $node->id($node->branch_____**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> On 03/08/2011 14:58, Roy Chaudhuri wrote: >> >> Hi Shachi, >> >> I don't think you can draw labels on branches using >> Bio::Tree::Draw::Cladogram. However, it will draw node >> labels, >> so you >> could copy the branch lengths over to the node ids: >> >> my $obj1 = >> Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => 1, >> -tree => >> $t1, >> -compact => >> 0); >> for my $node ($tree->get_nodes) { >> $node->id($node->branch_____**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> Incidentally, in your script you write the tree out to a >> file, >> then read >> it back in using TreeIO. This is unnecessary, you can >> use $tree >> directly >> as input to Bio::Tree::Draw::Cladogram. >> >> Alternatively, you could write out a newick file and use >> non-Bioperl >> software such as njplot or MEGA to draw your tree with >> labelled >> branch >> lengths. >> >> Cheers, >> Roy. >> >> On 03/08/2011 07:00, Shachi Gahoi wrote: >> >> Dear All >> >> I am using Bio::Tree modules for constructing and >> drawing >> tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also >> attached >> generated tree. 
>> >> Thanks in advance >> >> >> ##############################**____##########################** >> ##__##__######################**####__####__###### >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', >> -format=>'clustalw'); >> >> my $dfactory = >> Bio::Tree::DistanceFactory->__**__new(-method => >> 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics-**____>new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', >> -file >> =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', >> -align >> => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => >> 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = >> Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => >> 1, >> >> -tree >> => $t1, >> >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> >> ##############################**____##########################** >> ##__##__######################**####__####__############## >> >> >> >> >> ______________________________**_____________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org > bio.org > >> >> >> >> >> >> http://lists.open-bio.org/____**mailman/listinfo/bioperl-l >> >> > >> >> >> >> >> >> >> >> >> >> >> -- >> Regards, >> Shachi >> >> >> >> >> >> -- >> Regards, >> Shachi >> > > -- Regards, Shachi From p.j.a.cock at googlemail.com Sun Aug 7 05:40:52 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 7 Aug 2011 10:40:52 +0100 Subject: 
[Bioperl-l] SeqIO: paired end reads
In-Reply-To:
References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com>
Message-ID:

On Friday, August 5, 2011, Lee Katz wrote:
> Thank you. I figured out through the Newbler manual that there is a linker
> sequence to separate the paired end reads. Then, the forum at
> http://seqanswers.com/forums/showthread.php?t=12940 showed me that the
> linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC".

There is more than one Roche 454 linker sequence depending on the chemistry used; one is the same as its reverse complement, one isn't.

There is nothing in the SFF file format (nor the Roche-specific XML manifest, last time I checked) that handles the paired end information explicitly.

> I think a useful addition to bioperl could be to have paired end reads.

Maybe, but to do this well you'd want to do flow space alignment of the reads to the linker sequence to find the imperfectly called linker sequences.

Personally I use sff_extract, which is a free open source command line tool for this (calling an external alignment tool for paired end 454).

> This is outside of the domain of bioperl, but now I am left wondering how I
> could specify the distance between reads in Newbler, if the linker sequence
> is fixed.

How to do that depends on the alignment or assembly tool you are using.

Peter

From cjfields at illinois.edu Sun Aug 7 11:51:19 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 7 Aug 2011 10:51:19 -0500
Subject: [Bioperl-l] SeqIO: paired end reads
In-Reply-To:
References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com>
Message-ID: <19923C8B-6C84-4D9B-8D37-86CAE9BC681E@illinois.edu>

On Aug 7, 2011, at 4:40 AM, Peter Cock wrote:

> On Friday, August 5, 2011, Lee Katz wrote:
>> Thank you. I figured out through the Newbler manual that there is a linker
>> sequence to separate the paired end reads.
>> Then, the forum at
>> http://seqanswers.com/forums/showthread.php?t=12940 showed me that the
>> linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC".
>
> There is more than one Roche 454 linker sequence depending on the chemistry
> used; one is the same as its reverse complement, one isn't.
>
> There is nothing in the SFF file format (nor the Roche-specific XML manifest,
> last time I checked) that handles the paired end information explicitly.

Yep, it's all implied AFAIK.

>> I think a useful addition to bioperl could be to have paired end reads.
>
> Maybe, but to do this well you'd want to do flow space alignment of the
> reads to the linker sequence to find the imperfectly called linker
> sequences.
>
> Personally I use sff_extract, which is a free open source command line tool
> for this (calling an external alignment tool for paired end 454).

I think it could be done, but I would implement something like this as a wrapper around faster tools (like sff_extract or similar). Implementing the functionality in pure (bio)perl/(bio)python doesn't make much sense if there are newer/faster tools out there.

>> This is outside of the domain of bioperl, but now I am left wondering how I
>> could specify the distance between reads in Newbler, if the linker sequence
>> is fixed.
>
> How to do that depends on the alignment or assembly tool you are using.
>
> Peter

Yep. I don't think there is a defined way to specify that in any format that I know of.
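To make the "implied" pairing concrete, here is a minimal sketch in plain Perl of splitting one read into its two mates at the linker. It assumes an exact, full-length match to the linker quoted in this thread, which is precisely the simplification Peter warns about: real reads need inexact (ideally flow space) matching, and other chemistries use a different linker. The function name split_at_linker and the example read are made up for illustration.

```perl
use strict;
use warnings;

# Linker sequence quoted earlier in this thread (one of several 454 linkers).
my $LINKER = 'GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC';

# Split a read into (left, right) at the first exact linker match.
# Returns an empty list when the linker is not found verbatim.
sub split_at_linker {
    my ($read, $linker) = @_;
    my $pos = index(uc $read, uc $linker);   # case-insensitive exact search
    return if $pos < 0;
    my $left  = substr($read, 0, $pos);
    my $right = substr($read, $pos + length($linker));
    return ($left, $right);
}

# Made-up read: 10 bases, the linker, then 10 more bases.
my $read = 'ACGTACGTAC' . $LINKER . 'TTGACCAGTA';
my ($left, $right) = split_at_linker($read, $LINKER);
print "left=$left right=$right\n";
```

Mate orientation, minimum mate length and partial linkers are all ignored here, which is a large part of why wrapping an existing tool is the more sensible route.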
chris

From Russell.Smithies at agresearch.co.nz Sun Aug 7 17:45:19 2011
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 8 Aug 2011 09:45:19 +1200
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To:
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> <4E3BCBE8.4030303@gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D3C9@exchsth.agresearch.co.nz>

The constructor for Bio::Tree::Draw::Cladogram lets you specify the font and size, did you try setting it there?

 Title   : new
 Usage   : my $obj = Bio::Tree::Draw::Cladogram->new();
 Function: Builds a new Bio::Tree::Draw::Cladogram object
 Returns : Bio::Tree::Draw::Cladogram
 Args    : -tree      => Bio::Tree::Tree object
           -second    => Bio::Tree::Tree object (optional)
           -font      => font name [string] (optional)      <<<<-------------
           -size      => font size [integer] (optional)     <<<<-------------
           -top       => top margin [integer] (optional)
           -bottom    => bottom margin [integer] (optional)
           -left      => left margin [integer] (optional)
           -right     => right margin [integer] (optional)
           -tip       => extra tip space [integer] (optional)
           -column    => extra space between cladograms [integer] (optional)
           -compact   => ignore branch lengths [boolean] (optional)
           -ratio     => horizontal to vertical ratio [integer] (optional)
           -colors    => use colors to color edges [boolean] (optional)
           -bootstrap => draw bootstrap or internal ids [boolean]

--Russell

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Shachi Gahoi
> Sent: Saturday, 6 August 2011 4:26 p.m.
> To: Roy Chaudhuri
> Cc: bioperl-l List
> Subject: Re: [Bioperl-l] How to show branch length value in tree
>
> Thank you so much.
>
> Please tell me one more thing, *can I reduce branch length font?
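Putting the -font and -size arguments from the constructor documentation together with the labelling loop from earlier in the thread, a minimal sketch might look like the following. The font name and size are assumed values chosen for illustration, the file names are the ones used earlier in the thread, and I have not checked whether -size is applied to internal node labels as well as to leaf labels.

```perl
use strict;
use warnings;
use Bio::TreeIO;
use Bio::Tree::Draw::Cladogram;

# Read the newick tree written earlier in the thread; adjust the path as needed.
my $treeio = Bio::TreeIO->new(-format => 'newick', -file => 'ADP1.dnd');
my $tree   = $treeio->next_tree or die "no tree found in ADP1.dnd\n";

# Copy branch lengths onto internal nodes only, so leaves keep their accessions.
for my $node ($tree->get_nodes) {
    next if $node->is_Leaf;
    $node->id($node->branch_length) if defined $node->branch_length;
}

my $drawing = Bio::Tree::Draw::Cladogram->new(
    -tree      => $tree,
    -bootstrap => 1,             # draw internal ids (here: the branch lengths)
    -compact   => 0,             # keep branch lengths in the layout
    -font      => 'Helvetica',   # assumed PostScript font name
    -size      => 8,             # assumed size; pick whatever reads well
);
$drawing->print(-file => 'ADP1.eps');
```

If the smaller font turns out to affect every label rather than just the branch lengths, drawing the tree with external software such as njplot or MEGA, as suggested earlier, may be the simpler option.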
> * > On Fri, Aug 5, 2011 at 4:24 PM, Roy Chaudhuri > wrote: > > > In that case then you only want to add branch lengths to non-leaf > nodes, so > > it would be: > > > > > > for my $node ($t1->get_nodes) { > > $node->id($node->branch_**length) unless $node->is_Leaf > > > > } > > > > > > On 05/08/2011 06:40, Shachi Gahoi wrote: > > > >> > >> Instead of both node id and accession, Can I replace node id with > >> accession? > >> > >> > >> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri > >> >> wrote: > >> > >> Hi Shachi, > >> > >> Please keep replies on the mailing list, that way others can > follow > >> the discussion. > >> > >> As I mentioned, it is not possible to draw njplot-style trees > with > >> labelled branches using Bio::Tree::Draw::Cladogram, it currently > >> only labels nodes (you could perhaps add branch labels as a > feature > >> request on Redmine). > >> > >> The code I gave overwrites the existing "leaf" node ids (the > >> accessions) with branch lengths, if you want to also keep the > >> existing labels you could try something like: > >> > >> > >> for my $node ($t1->get_nodes) { > >> if ($node->is_Leaf) { > >> $node->id($node->branch___**length.' '.$node->id); > >> } else { > >> > >> $node->id($node->branch___**length) > >> } > >> } > >> > >> Cheers, > >> Roy. > >> > >> > >> On 04/08/2011 05:36, Shachi Gahoi wrote: > >> > >> Thank You so much. Now branch length is coming in tree. > >> > >> But I want Accesssion number in place of node id. > >> > >> I attached snapshot of tree as I want. Please tell me how can > I > >> do this. 
> >>
> >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri wrote:
> >>
> >> Sorry, the code had a typo, it should be:
> >>
> >> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
> >>                                            -tree      => $t1,
> >>                                            -compact   => 0);
> >> for my $node ($t1->get_nodes) {
> >>     $node->id($node->branch_length) if defined $node->branch_length;
> >> }
> >> $obj1->print(-file => "$dir/$stem.eps")
> >>
> >> On 03/08/2011 14:58, Roy Chaudhuri wrote:
> >>
> >> Hi Shachi,
> >>
> >> I don't think you can draw labels on branches using
> >> Bio::Tree::Draw::Cladogram. However, it will draw node labels, so you
> >> could copy the branch lengths over to the node ids:
> >>
> >> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
> >>                                            -tree      => $t1,
> >>                                            -compact   => 0);
> >> for my $node ($tree->get_nodes) {
> >>     $node->id($node->branch_length) if defined $node->branch_length;
> >> }
> >> $obj1->print(-file => "$dir/$stem.eps")
> >>
> >> Incidentally, in your script you write the tree out to a file, then read
> >> it back in using TreeIO. This is unnecessary, you can use $tree directly
> >> as input to Bio::Tree::Draw::Cladogram.
> >>
> >> Alternatively, you could write out a newick file and use non-Bioperl
> >> software such as njplot or MEGA to draw your tree with labelled branch
> >> lengths.
> >>
> >> Cheers,
> >> Roy.
> >>
> >> On 03/08/2011 07:00, Shachi Gahoi wrote:
> >>
> >> Dear All
> >>
> >> I am using Bio::Tree modules for constructing and drawing
> >> tree. *I am unable
> >> to show branch length value in tree.
> >> *
> >> Please tell me How can I do this, if anybody knows.
> >>
> >> Here is my script which i am using...and i also attached
> >> generated tree.
> >>
> >> Thanks in advance
> >>
> >> ######################################################################
> >>
> >> use Bio::AlignIO;
> >> use Bio::Align::ProteinStatistics;
> >> use Bio::Tree::DistanceFactory;
> >> use Bio::TreeIO;
> >> use Bio::Tree::Draw::Cladogram;
> >>
> >> # for a dna alignment
> >> # can also use ProteinStatistics
> >>
> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw');
> >>
> >> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');
> >>
> >> my $stats = Bio::Align::ProteinStatistics->new;
> >>
> >> my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd');
> >>
> >> while( my $aln = $alnio->next_aln )
> >> {
> >>     my $mat = $stats->distance(-method => 'Kimura', -align => $aln);
> >>
> >>     my $tree = $dfactory->make_tree($mat);
> >>     $treeout->write_tree($tree);
> >> }
> >>
> >> my $dir = shift || '.';
> >>
> >> opendir(DIR, $dir) || die $!;
> >> for my $file ( readdir(DIR) )
> >> {
> >>     next unless $file =~ /(\S+)\.dnd$/;
> >>     my $stem = $1;
> >>     my $treeio = Bio::TreeIO->new('-format' => 'newick',
> >>                                   '-file'   => "$dir/$file");
> >>
> >>     if( my $t1 = $treeio->next_tree )
> >>     {
> >>         my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
> >>                                                    -tree      => $t1,
> >>                                                    -compact   => 0);
> >>         $obj1->print(-file => "$dir/$stem.eps");
> >>     }
> >> }
> >>
> >> ######################################################################
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >> --
> >> Regards,
> >> Shachi
> >>
> >> --
> >> Regards,
> >> Shachi
> >>
>
> --
> Regards,
> Shachi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

=======================================================================
Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately.
=======================================================================

From cjfields at illinois.edu Tue Aug 9 16:10:37 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 9 Aug 2011 15:10:37 -0500
Subject: [Bioperl-l] Question to Bio::SearchIO::infernal.pm
In-Reply-To:
References: <4E32E14B020000EE00004F57@gwia1.boku.ac.at>
Message-ID: <683C7B42-338F-42AE-AF93-11BFB4DB2CB7@illinois.edu>

Following this up: Nadine, did you have a bug to report? It's kind of hard to fix this without some example data.

chris

On Aug 3, 2011, at 8:10 AM, Chris Fields wrote:

> Nadine,
>
> Hard to guess w/o seeing the report, but I'm not terribly surprised. I believe I only coded for simple 1 CM reports, IIRC. You'll have to file this as a bug on redmine along with an example.
>
> chris
>
> On Jul 29, 2011, at 9:35 AM, Nadine Elpida Tatto wrote:
>
>> Hi There!
>>
>> I was wondering if you would or can help me.
>>
>> I have an infernal report containing about 2000 CMs from an infernal run against Rfam.cm. To parse this report I wanted to use Bio::SearchIO::infernal.pm.
Unfortunately this turned out to be a problem for me, because "$parser->next_result" only delivers the result for the first CM in the report and nothing more.
>>
>> My code:
>> #!/usr/bin/perl -w
>>
>> use strict;use Data::Dumper;
>> use Bio::SearchIO;
>>
>> my $infile = $ARGV[0]; # infernal report
>> my $parser = Bio::SearchIO->new(-format => 'Infernal',
>>                                 -file   => $infile);
>>
>> while( my $result = $parser->next_result ) {
>>     print $result->query_name . "\n";
>> }
>>
>> exit;
>>
>> The output:
>>
>> ntatto:~$ ./infernalParser.pl infernal.output
>> 5S_rRNA
>> ntatto:~$
>>
>> I would expect the following (like parsing a blast report):
>>
>> ntatto:~$ ./infernalParser.pl infernal.output
>> 5S_rRNA
>> 5_8S_rRNA
>> U1
>> ...
>> ntatto:~$
>>
>> I would be glad for help.
>>
>> Thank you in advance.
>>
>> Best Regards
>>
>> N Tatto
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From torsten.seemann at infotech.monash.edu.au Sun Aug 14 04:32:46 2011
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sun, 14 Aug 2011 18:32:46 +1000
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID:

> I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.

I'm pretty sure that NCBI/GenBank do *not* accept GenBank files for submission - which I found somewhat ironic!

They require an ASN.1-formatted file (an XML-like hierarchical format that pre-dates XML), which is sometimes given a .sqn extension if you use the Sequin GUI to prepare it. There are command line tools like "tbl2asn" which will take the .tbl and .fsa files Brian has listed to produce the ASN file too.

As far as I know, there are no NCBI tools to take a .gbk and produce the .tbl/.fsa/.agp - does anyone know otherwise?

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Dept. Microbiology, Monash University, AUSTRALIA

From cjfields at illinois.edu Sun Aug 14 10:22:10 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 14 Aug 2011 09:22:10 -0500
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To:
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID: <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>

Not that I'm aware of, though it shouldn't be hard to set something up using Bio::SeqIO for that.

chris

On Aug 14, 2011, at 3:32 AM, Torsten Seemann wrote:

>> I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.
>
> I'm pretty sure that NCBI/GenBank do *not* accept GenBank files for
> submission - which I found somewhat ironic!
>
> They require an ASN.1-formatted file (an XML-like hierarchical format
> that pre-dates XML), which is sometimes given a .sqn extension if you use
> the Sequin GUI to prepare it. There are command line tools like
> "tbl2asn" which will take the .tbl and .fsa files Brian has listed to
> produce the ASN file too.
>
> As far as I know, there are no NCBI tools to take a .gbk and produce
> the .tbl/.fsa/.agp - does anyone know otherwise?
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
> University, AUSTRALIA
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From maximilien1er at gmail.com Sun Aug 14 10:23:39 2011
From: maximilien1er at gmail.com (Maxime Déraspe)
Date: Sun, 14 Aug 2011 10:23:39 -0400
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To:
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID: <1313331819.15034.4.camel@maximilian-home>

I know that Artemis from the Sanger Institute can convert a GenBank file into a Sequin tab file. You could then use that file to submit to NCBI with their Sequin software. But I think that the GenBank file would be OK too.

Max

On Sun, 2011-08-14 at 18:32 +1000, Torsten Seemann wrote:
> > I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.
>
> I'm pretty sure that NCBI/GenBank do *not* accept GenBank files for
> submission - which I found somewhat ironic!
>
> They require an ASN.1-formatted file (an XML-like hierarchical format
> that pre-dates XML), which is sometimes given a .sqn extension if you use
> the Sequin GUI to prepare it. There are command line tools like
> "tbl2asn" which will take the .tbl and .fsa files Brian has listed to
> produce the ASN file too.
>
> As far as I know, there are no NCBI tools to take a .gbk and produce
> the .tbl/.fsa/.agp - does anyone know otherwise?
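
For reference, here is a minimal sketch of the Bio::SeqIO route suggested earlier in this thread: read a GenBank file and write the .fsa and .tbl inputs that tbl2asn takes. Treat it as a starting point under stated assumptions, not a finished converter. The helper names (tbl_feature_lines, feature_ranges, convert) are made up for illustration. The sketch assumes the five-column feature table layout (start, stop, feature key, then tab-indented qualifier/value lines), skips the source feature and /translation qualifiers on the assumption that tbl2asn takes those from elsewhere, and makes no attempt at partial locations, features spanning the origin, or .agp output.

```perl
#!/usr/bin/perl
# Sketch only: GenBank -> FASTA (.fsa) + feature table (.tbl).
use strict;
use warnings;

# Format one feature as feature-table lines.
#   $key    : feature key, e.g. 'CDS'
#   $ranges : arrayref of [from, to] pairs, already in the order to be written
#   $quals  : arrayref of [qualifier, value] pairs
sub tbl_feature_lines {
    my ($key, $ranges, $quals) = @_;
    my @lines;
    for my $i (0 .. $#{$ranges}) {
        my ($from, $to) = @{ $ranges->[$i] };
        # Only the first interval of a feature carries the feature key.
        push @lines, $i == 0 ? "$from\t$to\t$key" : "$from\t$to";
    }
    # Qualifier lines start with three empty columns.
    push @lines, "\t\t\t$_->[0]\t$_->[1]" for @{$quals};
    return @lines;
}

# Turn a BioPerl feature into [from, to] pairs. Sub-locations are sorted by
# start; for minus-strand features the order and the coordinates are both
# reversed (assumption: all sub-locations lie on one strand of one sequence).
sub feature_ranges {
    my ($feat) = @_;
    my @ranges = sort { $a->[0] <=> $b->[0] }
                 map  { [ $_->start, $_->end ] } $feat->location->each_Location;
    if (($feat->strand || 0) < 0) {
        @ranges = map { [ $_->[1], $_->[0] ] } reverse @ranges;
    }
    return \@ranges;
}

# Read $gbk and write "$stem.fsa" and "$stem.tbl".
sub convert {
    my ($gbk, $stem) = @_;
    require Bio::SeqIO;    # loaded here so the helpers above work without BioPerl
    my $in  = Bio::SeqIO->new(-file => $gbk,          -format => 'genbank');
    my $fsa = Bio::SeqIO->new(-file => ">$stem.fsa",  -format => 'fasta');
    open my $tbl, '>', "$stem.tbl" or die "Cannot write $stem.tbl: $!";
    while (my $seq = $in->next_seq) {
        $fsa->write_seq($seq);
        # The table header must name the same ID as the FASTA record.
        print {$tbl} ">Feature ", $seq->display_id, "\n";
        for my $feat ($seq->get_SeqFeatures) {
            next if $feat->primary_tag eq 'source';    # assumption: not wanted in .tbl
            my @quals;
            for my $tag ($feat->get_all_tags) {
                next if $tag eq 'translation';         # assumption: not wanted in .tbl
                push @quals, map { [ $tag, $_ ] } $feat->get_tag_values($tag);
            }
            print {$tbl} "$_\n"
                for tbl_feature_lines($feat->primary_tag, feature_ranges($feat), \@quals);
        }
    }
    close $tbl or die "Cannot close $stem.tbl: $!";
    return;
}

# Run only when called as a script with two arguments: input file, output stem.
convert(@ARGV) if !caller() && @ARGV == 2;
```

Saved under a name such as gbk2tbl.pl (hypothetical), it would be run as "perl gbk2tbl.pl input.gbk out" to write out.fsa and out.tbl. The formatting helper is kept free of BioPerl calls so it can be checked on plain data before trusting the output with tbl2asn.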

From punit_vergoboy2004 at yahoo.co.in Thu Aug 18 08:14:54 2011
From: punit_vergoboy2004 at yahoo.co.in (punit kumar)
Date: Thu, 18 Aug 2011 17:44:54 +0530 (IST)
Subject: [Bioperl-l] query about Bio::Tools::Run::RemoteBlast
In-Reply-To: <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>
Message-ID: <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>

hi friends,

I am new to BioPerl, and I am using "Bio::Tools::Run::RemoteBlast" for remote BLAST. I tried to use this module and have had a little success so far. I want to get the description part of the BLAST alignments that were found against my query sequence, as shown in the result below, which is the output table of online BLAST:

Sequences producing significant alignments:
Accession  Description  Max score  Total score  Query coverage  E value  Links
NP_216760.1 acyl carrier protein [Mycobacterium tuberculosis H37Rv] >ref|NP_336774.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >ref|NP_855917.1| acyl carrier protein [Mycobacterium bovis AF2122/97] >ref|YP_978350.1| acyl carrier protein [Mycobacterium bovis BCG str. Pasteur 1173P2] >ref|YP_001283588.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_001288206.1| acyl carrier protein [Mycobacterium tuberculosis F11] >ref|ZP_02551632.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_002645307.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >ref|YP_003031689.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >ref|ZP_04925721.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >ref|ZP_04981085.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str.
Haarlem] >ref|ZP_05141736.1| acyl carrier protein [Mycobacterium tuberculosis '98-R604 INH-RIF-EM'] >ref|ZP_06433498.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >ref|ZP_06437620.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >ref|ZP_06443178.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >ref|ZP_06450592.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >ref|ZP_06455160.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >ref|ZP_06504896.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >ref|ZP_06510220.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >ref|ZP_06513730.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >ref|ZP_06517747.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >ref|ZP_06521786.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >ref|ZP_06799170.1| acyl carrier protein [Mycobacterium tuberculosis 210] >ref|ZP_06952619.1| acyl carrier protein [Mycobacterium tuberculosis KZN 4207] >ref|ZP_06960948.1| acyl carrier protein [Mycobacterium tuberculosis KZN R506] >ref|ZP_07013145.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >ref|ZP_07414839.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >ref|ZP_07418616.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >ref|ZP_07423348.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >ref|ZP_07427715.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >ref|ZP_07432018.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 
SUMu005] >ref|ZP_07436410.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >ref|ZP_07440655.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >ref|ZP_07445228.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >ref|ZP_07481045.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >ref|ZP_07485275.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >ref|ZP_07489492.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >ref|ZP_07494023.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >ref|ZP_07816044.1| acyl carrier protein [Mycobacterium tuberculosis KZN V2475] >ref|YP_004723912.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >ref|YP_004745700.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >sp|P0A4W6.1|ACPM_MYCTU RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >sp|P0A4W7.1|ACPM_MYCBO RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA94640.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium tuberculosis H37Rv] >gb|AAK46588.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >emb|CAD97121.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium bovis AF2122/97] >emb|CAL72249.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. Pasteur 1173P2] >gb|EAY60463.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >gb|EBA42598.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. 
Haarlem] >gb|ABQ74026.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium tuberculosis H37Ra] >gb|ABR06604.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis F11] >dbj|BAH26539.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >gb|ACT24794.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >gb|EFD13913.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >gb|EFD18035.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >gb|EFD21093.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >gb|EFD43942.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >gb|EFD47767.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >gb|EFD53534.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >gb|EFD58858.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >gb|EFD62368.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >gb|EFD73930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >gb|EFD77945.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >gb|EFI30824.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >gb|EFO74536.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >gb|EFP15742.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >gb|EFP19094.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >gb|EFP22930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >gb|EFP26734.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 
SUMu005] >gb|EFP30496.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >gb|EFP33906.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >gb|EFP38213.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >gb|EFP42922.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >gb|EFP46864.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >gb|EFP50800.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >gb|EFP54373.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >gb|EGB28294.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CDC1551A] >gb|EGE50793.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis W-148] >gb|AEB03875.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 4207] >gb|AEJ47271.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5079] >gb|AEJ50890.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5180] >emb|CCC27325.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >emb|CCC44598.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >emb|CCC64838.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. Moreau RDJ] 223 223 100% 1e-74 1KLP_A Chain A, The Solution Structure Of Acyl Carrier Protein From Mycobacterium Tuberculosis 220 220 99% 2e-73 ZP_04748738.1 acyl carrier protein [Mycobacterium kansasii ATCC 12478] 165 165 100% 9e-52 ZP_05224070.1 acyl carrier protein [Mycobacterium intracellulare ATCC 13950] 162 162 100% 8e-51 NP_960931.1 acyl carrier protein [Mycobacterium avium subsp. 
paratuberculosis K-10] >ref|YP_881402.1| acyl carrier protein [Mycobacterium avium 104] >ref|ZP_05216419.1| acyl carrier protein [Mycobacterium avium subsp. avium ATCC 25291] >gb|AAS04314.1| AcpM [Mycobacterium avium subsp. paratuberculosis K-10] >gb|ABK65172.1| acyl carrier protein [Mycobacterium avium 104] >gb|EGO40713.1| acyl carrier protein [Mycobacterium avium subsp. paratuberculosis S397] 162 162 100% 8e-51
NP_302135.1 acyl carrier protein [Mycobacterium leprae TN] >ref|YP_002503765.1| acyl carrier protein [Mycobacterium leprae Br4923] >sp|O69475.1|ACPM_MYCLE RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA19202.1| acyl carrier protein [Mycobacterium leprae] >emb|CAC30605.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae] >emb|CAR71749.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae Br4923] 162 162 100% 2e-50
ZP_07966703.1 hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] >gb|EFV12044.1| hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] 162 162 88% 3e-50
YP_905336.1 acyl carrier protein [Mycobacterium ulcerans Agy99] >ref|YP_001851618.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] >gb|ABL03865.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium ulcerans Agy99] >gb|ACC41763.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] 161 161 100% 3e-50
ZP_08713925.1 acyl carrier protein [Mycobacterium colombiense CECT 3035] >gb|EGT87768.1| acyl carrier protein [Mycobacterium colombiense CECT 3035] 160 160 100% 6e-50
YP_003660002.1 phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] >gb|ADG99171.1| phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] 160 160 88% 8e-50

where in my code:

 print "hit name is ",$hit->name, "\n"; # gives me the reference of aligned sequence
 print"Score: ".$hsp->score."\n"; # gives me the score of aligned sequence
 print"E-val: ".$hsp->expect."\n"; # gives me the evalue of aligned sequence
 print"percent identity: ".$hsp->percent_identity."\n"; # gives me the query coverage of aligned sequence

I want to use

 #print "Description ",$hsp->desc, "\n";

to show the description, but it is not working. Can anybody help me with this? I need to know urgently. Thanks for reading; I hope I managed to explain my problem.

Below is a copy of the code I am trying to use:

 use Bio::Tools::Run::RemoteBlast;
 use strict;
 my $v = 1;
 my $prog = 'blastp';
 my $db   = 'refseq_protein';
 my $e_val= '1e-10'; #1e-10
 my $result;
 #my $code=q| my $answer = my $a / my $b;|;

 my @params = ( '-prog' => $prog,
                '-data' => $db,
                '-expect' => $e_val );
 my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
 $v = 1;
 my $str = Bio::SeqIO->new(-file=>'prot.txt' , '-format' => 'fasta' );
 my $input;
 while($input = $str->next_seq())
 {
     #  Blast a sequence against a database:
     my $r = $factory->submit_blast($input);
     print STDERR "waiting..." if( $v > 0 );

     my %hit_evalue;
     my @evalue;

     while ( my @rids = $factory->each_rid ) {
         foreach my $rid ( @rids ) {
             my $rc = $factory->retrieve_blast($rid);
             if( !ref($rc) ) {
                 if( $rc < 0 ) {
                     $factory->remove_rid($rid);
                 }
                 print STDERR "." if ( $v > 0 );
                 sleep 5;
             } else {
                 $factory->remove_rid($rid);
                 #print $rid."\n\n";
                 my $result = $rc->next_result;
                 print "db is ", $result->database_name(), "\n";
                 my $count = 0;
                 while( my $hit = $result->next_hit ) {
                     $count++;
                     #next unless ( $v > 0);
                     #print "hit name is ", $hit->name, "\n";
                     while( my $hsp = $hit->next_hsp ) {
                         print "hit name is ",$hit->name, "\n";
                         #print "Query name is ",$hsp->desc, "\n"; exit;
                         print"Score: ".$hsp->score."\n";
                         print"E-val: ".$hsp->expect."\n";
                         print"percent identity: ".$hsp->percent_identity."\n";
                     }
                 }
             }
         }
     }
 }

From pcantalupo at gmail.com Thu Aug 18 08:55:18 2011
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Thu, 18 Aug 2011 08:55:18 -0400
Subject: [Bioperl-l] query about Bio::Tools::Run::RemoteBlast
In-Reply-To: <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu> <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>
Message-ID:

Punit

I think you want '$hit->description' not '$hsp->desc'

Paul

Paul Cantalupo
University of Pittsburgh

On Thu, Aug 18, 2011 at 8:14 AM, punit kumar wrote:
> hi friends ,
>
> i am new to Bioperl , and i am using "Bio::Tools::Run::RemoteBlast" for remote blast i tried to use this module and i succeed a little yet, i want to get the description part of blast alignments which were found against my query sequence, as result is shown in format as given below, which is the out put table of ONLINE BLAST,
>
> Sequences producing significant alignments:
> Accession
> Description
> Max score
> Total score
> Query coverage
> E value
> Links
> NP_216760.1 acyl carrier protein [Mycobacterium tuberculosis H37Rv] >ref|NP_336774.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >ref|NP_855917.1| acyl carrier protein [Mycobacterium bovis AF2122/97] >ref|YP_978350.1| acyl carrier protein [Mycobacterium bovis BCG str. Pasteur 1173P2] >ref|YP_001283588.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_001288206.1| acyl carrier protein [Mycobacterium tuberculosis F11] >ref|ZP_02551632.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_002645307.1| acyl carrier protein [Mycobacterium bovis BCG str.
Tokyo 172] >ref|YP_003031689.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >ref|ZP_04925721.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >ref|ZP_04981085.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. Haarlem] >ref|ZP_05141736.1| acyl carrier > protein [Mycobacterium tuberculosis '98-R604 INH-RIF-EM'] >ref|ZP_06433498.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >ref|ZP_06437620.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >ref|ZP_06443178.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >ref|ZP_06450592.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >ref|ZP_06455160.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >ref|ZP_06504896.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >ref|ZP_06510220.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >ref|ZP_06513730.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >ref|ZP_06517747.1| meromycolate extension acyl carrier protein acpm > [Mycobacterium tuberculosis T85] >ref|ZP_06521786.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >ref|ZP_06799170.1| acyl carrier protein [Mycobacterium tuberculosis 210] >ref|ZP_06952619.1| acyl carrier protein [Mycobacterium tuberculosis KZN 4207] >ref|ZP_06960948.1| acyl carrier protein [Mycobacterium tuberculosis KZN R506] >ref|ZP_07013145.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >ref|ZP_07414839.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >ref|ZP_07418616.1| meromycolate extension acyl carrier protein acpM [Mycobacterium 
tuberculosis SUMu002] >ref|ZP_07423348.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >ref|ZP_07427715.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >ref|ZP_07432018.1| meromycolate extension acyl carrier protein > acpM [Mycobacterium tuberculosis SUMu005] >ref|ZP_07436410.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >ref|ZP_07440655.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >ref|ZP_07445228.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >ref|ZP_07481045.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >ref|ZP_07485275.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >ref|ZP_07489492.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >ref|ZP_07494023.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >ref|ZP_07816044.1| acyl carrier protein [Mycobacterium tuberculosis KZN V2475] >ref|YP_004723912.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum > GM041182] >ref|YP_004745700.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >sp|P0A4W6.1|ACPM_MYCTU RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >sp|P0A4W7.1|ACPM_MYCBO RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA94640.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium tuberculosis H37Rv] >gb|AAK46588.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >emb|CAD97121.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium bovis AF2122/97] >emb|CAL72249.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. 
Pasteur 1173P2] >gb|EAY60463.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >gb|EBA42598.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. Haarlem] >gb|ABQ74026.1| meromycolate extension acyl carrier protein AcpM > [Mycobacterium tuberculosis H37Ra] >gb|ABR06604.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis F11] >dbj|BAH26539.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >gb|ACT24794.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >gb|EFD13913.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >gb|EFD18035.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >gb|EFD21093.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >gb|EFD43942.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >gb|EFD47767.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >gb|EFD53534.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >gb|EFD58858.1| meromycolate extension acyl carrier > protein acpM [Mycobacterium tuberculosis T92] >gb|EFD62368.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >gb|EFD73930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >gb|EFD77945.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >gb|EFI30824.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >gb|EFO74536.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >gb|EFP15742.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >gb|EFP19094.1| meromycolate extension acyl carrier protein acpM [Mycobacterium 
tuberculosis SUMu003] >gb|EFP22930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >gb|EFP26734.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu005] > >gb|EFP30496.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >gb|EFP33906.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >gb|EFP38213.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >gb|EFP42922.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >gb|EFP46864.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >gb|EFP50800.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >gb|EFP54373.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >gb|EGB28294.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CDC1551A] >gb|EGE50793.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis W-148] >gb|AEB03875.1| meromycolate extension acyl > carrier protein acpM [Mycobacterium tuberculosis KZN 4207] >gb|AEJ47271.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5079] >gb|AEJ50890.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5180] >emb|CCC27325.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >emb|CCC44598.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >emb|CCC64838.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. 
Moreau RDJ] 223 223 100% 1e-74 > 1KLP_A Chain A, The Solution Structure Of Acyl Carrier Protein From Mycobacterium Tuberculosis 220 220 99% 2e-73 > ZP_04748738.1 acyl carrier protein [Mycobacterium kansasii ATCC 12478] 165 165 100% 9e-52 > ZP_05224070.1 acyl carrier protein [Mycobacterium intracellulare ATCC 13950] 162 162 100% 8e-51 > NP_960931.1 acyl carrier protein [Mycobacterium avium subsp. paratuberculosis K-10] >ref|YP_881402.1| acyl carrier protein [Mycobacterium avium 104] >ref|ZP_05216419.1| acyl carrier protein [Mycobacterium avium subsp. avium ATCC 25291] >gb|AAS04314.1| AcpM [Mycobacterium avium subsp. paratuberculosis K-10] >gb|ABK65172.1| acyl carrier protein [Mycobacterium avium 104] >gb|EGO40713.1| acyl carrier protein [Mycobacterium avium subsp. paratuberculosis S397] 162 162 100% 8e-51 > NP_302135.1 acyl carrier protein [Mycobacterium leprae TN] >ref|YP_002503765.1| acyl carrier protein [Mycobacterium leprae Br4923] >sp|O69475.1|ACPM_MYCLE RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA19202.1| acyl carrier protein [Mycobacterium leprae] >emb|CAC30605.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae] >emb|CAR71749.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae Br4923] 162 162 100% 2e-50 > ZP_07966703.1 hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] >gb|EFV12044.1| hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] 162 162 88% 3e-50 > YP_905336.1 acyl carrier protein [Mycobacterium ulcerans Agy99] >ref|YP_001851618.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] >gb|ABL03865.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium ulcerans Agy99] >gb|ACC41763.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] 161 161 100% 3e-50 > ZP_08713925.1 acyl carrier protein [Mycobacterium colombiense CECT 3035] >gb|EGT87768.1| acyl carrier protein [Mycobacterium 
colombiense CECT 3035] 160 160 100% 6e-50 > YP_003660002.1 phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] >gb|ADG99171.1| phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] 160 160 88% 8e-50 > > where in my code: > > print "hit name is ",$hit->name, "\n"; # gives me the refrence of aligned sequence > print"Score: ".$hsp->score."\n"; # gives me the score of aligned sequence > print"E-val: ".$hsp->expect."\n"; # gives me the evalue of aligned sequence > print"percent identity: ".$hsp->percent_identity."\n"; # gives me the query coverage of aligned sequence > > i want to use #print "Description ",$hsp->desc, "\n"; to show the description but i am not getting can any body help me out for this i need to know urgently, thanks to read and i hope i was succesfull to explain my problem . > > below is the copy of my code i am trying to use : > > > > > use Bio::Tools::Run::RemoteBlast; > use strict; > my $v = 1; > my $prog = 'blastp'; > my $db = 'refseq_protein'; > my $e_val= '1e-10'; #1e-10 > > my $result; > #my $code=q| my $answer = my $a / my $b;|; > > > > > > my @params = ( > '-prog' => $prog, > '-data' => $db, > '-expect' => $e_val > ); > > my $factory = Bio::Tools::Run::RemoteBlast->new(@params); > $v = 1; > my $str = Bio::SeqIO->new(-file=>'prot.txt' , '-format' => 'fasta' ); > my $input; > while($input = $str->next_seq()) > { > > # Blast a sequence against a database: > > my $r = $factory->submit_blast($input); > print STDERR "waiting..." if( $v > 0 ); > > my %hit_evalue; > my @evalue; > > while ( my @rids = $factory->each_rid ) { > foreach my $rid ( @rids ) { > my $rc = $factory->retrieve_blast($rid); > if( !ref($rc) ) { > if( $rc < 0 ) { > $factory->remove_rid($rid); > } > print STDERR "." 
if ( $v > 0 ); > sleep 5; > } else { > $factory->remove_rid($rid); > #print $rid."\n\n"; > my $result = $rc->next_result; > > print "db is ", $result->database_name(), "\n"; > my $count = 0; > while( my $hit = $result->next_hit ) { > $count++; > #next unless ( $v > 0); > #print "hit name is ", $hit->name, "\n"; > while( my $hsp = $hit->next_hsp ) > { > print "hit name is ",$hit->name, "\n"; > #print "Query name is ",$hsp->desc, "\n"; exit; > > print"Score: ".$hsp->score."\n"; > print"E-val: ".$hsp->expect."\n"; > print"percent identity: ".$hsp->percent_identity."\n"; > } > > > } > } > } > } > } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From David.Messina at sbc.su.se Fri Aug 19 05:07:35 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Fri, 19 Aug 2011 11:07:35 +0200 Subject: [Bioperl-l] Fwd: pls help.. In-Reply-To: References: Message-ID: Whoops, resending; the attachment was too big. Ravi, please provide only a few example lines from your GFF file, or host the file elsewhere and post a link to it. Dave ---------- Forwarded message ---------- From: Dave Messina Date: Fri, Aug 19, 2011 at 10:53 Subject: pls help.. To: ravi.devani89 at gmail.com Cc: bioperl-l Ravi, Your message belongs on the main BioPerl list, not the bioperl-dev list, so I'm reposting it there. To sign up for the main list, go to: http://bioperl.org/mailman/listinfo/bioperl-l Dave ---------- Forwarded message ---------- From: Ravi Devani To: bioperl-dev at lists.open-bio.org Date: Fri, 19 Aug 2011 13:54:22 +0530 Subject: Fwd: pls help.. i tried to create a gff3 file from .gbk file using bioperl genbank2gff3 script but what i get is same features repeating many times.. and the file keeps growing in size until my harddisk gets full.. i have tried to filter all other features except "region" but still it repeats a single entry many times.. 
i have attached a part of the file generated.. pls kindly help me. From ravi.devani89 at gmail.com Fri Aug 19 01:16:00 2011 From: ravi.devani89 at gmail.com (Ravi Devani) Date: Fri, 19 Aug 2011 10:46:00 +0530 Subject: [Bioperl-l] pls help.. In-Reply-To: References: Message-ID: ---------- Forwarded message ---------- From: Ravi Devani Date: Thu, Aug 18, 2011 at 12:40 PM Subject: pls help.. To: scott at scottcain.net i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but what i get is same features repeating many times.. and the file keeps growing in size ntil my harddisk gets full.. i have tried to filter all other features except "region" but still it repeats a single entry many times.. i have attached a part of the file generated.. pls kindly help me. -------------- next part -------------- A non-text attachment was scrubbed... Name: ref_chrUn.gff Type: application/octet-stream Size: 602112 bytes Desc: not available URL: From anjan.purkayastha at gmail.com Mon Aug 15 10:32:39 2011 From: anjan.purkayastha at gmail.com (ANJAN PURKAYASTHA) Date: Mon, 15 Aug 2011 10:32:39 -0400 Subject: [Bioperl-l] Problem with Bio::DB::Taxonomy Message-ID: Hello, I wrote a short test script for the Bio::DB::Taxonomy module: ================================================ #!/usr/bin/perl -w use strict; use Bio::DB::Taxonomy; my ($nodesfile, $namesfile)= ('nodes.dmp', 'names.dmp'); my $db= new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodesfile, -namesfile => $namesfile ); my $bacteria= $db->get_Taxonomy_Node(-taxonid => '2'); print("$bacteria->id\t$bacteria->name\n"); ================================================ On running this script I expect the following output: 2 Bacteria. Instead I get a warning: UNIVERSAL->import is deprecated and will be removed in a future perl at /usr/share/perl5/vendor_perl/Bio/Tree/TreeFunctionsI.pm line 94. 
and the following output: Bio::Taxon=HASH(0x158dbe0)->id Bio::Taxon=HASH(0x158dbe0)->name The script seems to be working but there seems to be a problem with dereferencing a Bio::Taxon object. Any leads on how to troubleshoot this will be much appreciated. Thanks Anjan -- =================================== Anjan Purkayastha, PhD Senior Computational Biologist TessArae LLC 46090 Lake Center Plaza, Suite 304 Potomac Falls, VA 20165 Office- 703.444.7188 ext. 116 Mobile-703.740.6939 =================================== From scott at scottcain.net Fri Aug 19 09:45:47 2011 From: scott at scottcain.net (Scott Cain) Date: Fri, 19 Aug 2011 09:45:47 -0400 Subject: [Bioperl-l] pls help.. In-Reply-To: References: Message-ID: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> Ravi, The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using. Scott Sent from my iPhone On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: > ---------- Forwarded message ---------- > From: Ravi Devani > Date: Thu, Aug 18, 2011 at 12:40 PM > Subject: pls help.. > To: scott at scottcain.net > > > i tried to create a gff3 file from .gbk file using > bp_genbank2gff3.pl but > what i get is same features repeating many times.. and the file keeps > growing in size until my harddisk gets full.. i have tried to filter > all > other features except "region" but still it repeats a single entry > many > times.. i have attached a part of the file generated.. pls kindly > help me. 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 19 10:05:03 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 19 Aug 2011 09:05:03 -0500 Subject: [Bioperl-l] Problem with Bio::DB::Taxonomy In-Reply-To: References: Message-ID: <7A733494-D831-43ED-9AE4-AB62AC5A2761@illinois.edu> Anjan, You are likely using an old version of BioPerl (this was fixed in the latest release on CPAN I believe). Bio::DB::Taxonomy uses Bio::Taxon, so the use of name() is incorrect; the method is node_name(). If name() is documented somewhere, that documentation is wrong, so let us know where it came from. Also, the print statement at the end isn't interpolating correctly: Perl interpolates the variable but does not call methods inside a double-quoted string. In general with objects I make this more explicit: print $bacteria->id."\t".$bacteria->node_name."\n"; Correcting that, it works for me: [cjfields@pyrimidine1 anjan]$ perl test.pl 2 Bacteria chris On Aug 15, 2011, at 9:32 AM, ANJAN PURKAYASTHA wrote: > Hello, > I wrote a short test script for the Bio::DB::Taxonomy module: > ================================================ > #!/usr/bin/perl -w > use strict; > use Bio::DB::Taxonomy; > > my ($nodesfile, $namesfile)= ('nodes.dmp', 'names.dmp'); > > my $db= new Bio::DB::Taxonomy(-source => 'flatfile', > -nodesfile => $nodesfile, > -namesfile => $namesfile > ); > > my $bacteria= $db->get_Taxonomy_Node(-taxonid => '2'); > print("$bacteria->id\t$bacteria->name\n"); > ================================================ > > On running this script I expect the following output: 2 Bacteria. > > Instead I get a warning: > UNIVERSAL->import is deprecated and will be removed in a future perl at > /usr/share/perl5/vendor_perl/Bio/Tree/TreeFunctionsI.pm line 94. 
> > and the following ouput: > Bio::Taxon=HASH(0x158dbe0)->id Bio::Taxon=HASH(0x158dbe0)->name > > The script seems to be working but there seems to be a problem with > dereferencing a Bio::Taxon object. > > Any leads on how to troubleshoot this will be much appreciated. > Thanks > Anjan > > > > -- > =================================== > Anjan Purkayastha, PhD > Senior Computational Biologist > TessArae LLC > 46090 Lake Center Plaza, Suite 304 > Potomac Falls, VA 20165** > Office- 703.444.7188 ext. 116 > Mobile-703.740.6939 > =================================== > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 19 10:26:06 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 19 Aug 2011 09:26:06 -0500 Subject: [Bioperl-l] pls help.. In-Reply-To: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> Message-ID: <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Scott, http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview (it's in the GFF file) It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note). On Aug 19, 2011, at 8:45 AM, Scott Cain wrote: > Ravi, > > The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using. > > Scott > > > Sent from my iPhone > > On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: > >> ---------- Forwarded message ---------- >> From: Ravi Devani >> Date: Thu, Aug 18, 2011 at 12:40 PM >> Subject: pls help.. >> To: scott at scottcain.net >> >> >> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >> what i get is same features repeating many times.. 
and the file keeps >> growing in size until my harddisk gets full.. i have tried to filter all >> other features except "region" but still it repeats a single entry many >> times.. i have attached a part of the file generated.. pls kindly help me. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Fri Aug 19 10:38:16 2011 From: scott at scottcain.net (Scott Cain) Date: Fri, 19 Aug 2011 10:38:16 -0400 Subject: [Bioperl-l] pls help.. In-Reply-To: <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Message-ID: I was wondering if perhaps the genbank file had been manipulated in some way. Scott On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields wrote: > Scott, > > http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview > > (it's in the GFF file) > > It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note). > > On Aug 19, 2011, at 8:45 AM, Scott Cain wrote: > >> Ravi, >> >> The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using. >> >> Scott >> >> >> Sent from my iPhone >> >> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: >> >>> ---------- Forwarded message ---------- >>> From: Ravi Devani >>> Date: Thu, Aug 18, 2011 at 12:40 PM >>> Subject: pls help.. >>> To: scott at scottcain.net >>> >>> >>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >>> what i get is same features repeating many times.. 
and the file keeps >>> growing in size until my harddisk gets full.. i have tried to filter all >>> other features except "region" but still it repeats a single entry many >>> times.. i have attached a part of the file generated.. pls kindly help me. >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From cjfields at illinois.edu Fri Aug 19 15:19:40 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 19 Aug 2011 14:19:40 -0500 Subject: [Bioperl-l] pls help.. In-Reply-To: References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Message-ID: Yeah, the output is rather odd. Maybe it's using the contig file version? chris On Aug 19, 2011, at 9:38 AM, Scott Cain wrote: > I was wondering if perhaps the genbank file had been manipulated in some way. > > Scott > > > On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields wrote: >> Scott, >> >> http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview >> >> (it's in the GFF file) >> >> It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note). >> >> On Aug 19, 2011, at 8:45 AM, Scott Cain wrote: >> >>> Ravi, >>> >>> The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using. 
>>> >>> Scott >>> >>> >>> Sent from my iPhone >>> >>> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: >>> >>>> ---------- Forwarded message ---------- >>>> From: Ravi Devani >>>> Date: Thu, Aug 18, 2011 at 12:40 PM >>>> Subject: pls help.. >>>> To: scott at scottcain.net >>>> >>>> >>>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >>>> what i get is same features repeating many times.. and the file keeps >>>> growing in size ntil my harddisk gets full.. i have tried to filter all >>>> other features except "region" but still it repeats a single entry many >>>> times.. i have attached a part of the file generated.. pls kindly help me. >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > > > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. scott at scottcain dot net > GMOD Coordinator (http://gmod.org/) 216-392-3087 > Ontario Institute for Cancer Research From hlapp at drycafe.net Fri Aug 19 23:38:51 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 19 Aug 2011 22:38:51 -0500 Subject: [Bioperl-l] [BioSQL-l] How is is_circular recorded in BioSQL (by BioPerl)? In-Reply-To: <4E2D79D6.6020108@gmail.com> References: <4E2D5000.30305@gmail.com> <4E2D5314.5090107@gmail.com> <4E2D5BAC.8020001@gmail.com> <4E2D79D6.6020108@gmail.com> Message-ID: <59AF5708-AECD-4375-9EB8-6E79D4B21C26@drycafe.net> I realize I'm chiming in here late, but the below sums it up quite well. In fact, biosequence.alphabet column was originally (pre-2002) called molecule, and the BioPerl Genbank writer defaults to alphabet() if molecule() is not defined. -hilmar Sent with a tap. 
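A minimal sketch of the fallback described above, in plain Perl with no BioPerl dependency. This is not the actual Bio::SeqIO::genbank code: Demo::Seq and locus_molecule are hypothetical names used only for illustration. It shows why a sequence that carries only an alphabet (as after a BioSQL round trip) would end up with a lower-case molecule type.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object; NOT a BioPerl class.
package Demo::Seq;
sub new      { my ($class, %args) = @_; return bless {%args}, $class }
sub molecule { return $_[0]{molecule} }   # e.g. 'DNA', as read from a LOCUS line
sub alphabet { return $_[0]{alphabet} }   # e.g. 'dna', as kept in biosequence.alphabet

package main;

# Hypothetical helper: prefer molecule(), fall back to alphabet()
# when molecule() is not defined.
sub locus_molecule {
    my ($seq) = @_;
    my $mol = $seq->molecule;
    return defined $mol ? $mol : $seq->alphabet;
}

my $from_file   = Demo::Seq->new(molecule => 'DNA', alphabet => 'dna');
my $from_biosql = Demo::Seq->new(alphabet => 'dna');   # molecule was never stored

print locus_molecule($from_file),   "\n";   # prints "DNA"
print locus_molecule($from_biosql), "\n";   # prints "dna"
```

Storing the molecule type as an extra annotation, as suggested in the quoted message below, would be one way to keep the original case across the round trip.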
On Jul 25, 2011, at 9:12 AM, Roy Chaudhuri wrote: > As with the is_circular hack, you could store the molecule type by adding it as an annotation in the SequenceProcessor (it's stored as $seq->molecule by BioPerl). > > Actually, when round-tripping a GenBank file through BioSQL, the LOCUS line molecule type ends up in lower case, which makes me wonder if it is coming from alphabet in the biosequence table. From hlapp at drycafe.net Fri Aug 19 21:02:12 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 19 Aug 2011 20:02:12 -0500 Subject: [Bioperl-l] Error writing SequenceProcessor to associate GO terms in biosql database In-Reply-To: <26C59A57-F54A-4237-8D97-4E7A77E55D59@sgul.ac.uk> References: <26C59A57-F54A-4237-8D97-4E7A77E55D59@sgul.ac.uk> Message-ID: <6BDB69DE-5856-4061-96FA-0CF2884EDD9E@drycafe.net> Hi Adam I'm not sure whether you've received a response to this. Apologies if not. There is indeed a NOT NULL constraint on seqfeature_qualifier_value.value. The only other metadata association table in BioSQL that does this is location_qualifier_value. In the latter case there is arguably some sense to that (at least originally for locations the purpose of that table was pretty much to store the fuzzy location start/end properties), but for seqfeatures this looks like a bug to me. I'll post this to the BioSQL list and fix it if there are no objections, but feel free to drop the NOT NULL on that column yourself in the meantime. The INSERT query gets constructed in the innards of Bioperl-db. There is no reason to mess with that for this problem though - just drop the NOT NULL constraint. -hilmar Sent with a tap. On Jul 26, 2011, at 10:07 AM, Adam Witney wrote: > > Hi, > > I'm trying to write a SequenceProcessor for a genbank file to associate GO terms to the GO data preloaded in my biosql database. 
The command looks like this: > > perl load_seqdatabase.pl --dbname=biosql --driver=Pg --host=myhost --port= 5432 --dbuser=user --dbpass=pass -format genbank -namespace testing -pipeline 'GOSequenceProcessor' --debug S_sonnei.EB1_s_sonnei.dat > > The SequenceProcessor process_seq looks like this: > > sub process_seq{ > my ($self,$seq) = @_; > > my @features = $seq->get_SeqFeatures(); > foreach my $feat ( @features ) { > if ( $feat->has_tag('db_xref') ) { > my @db_xrefs = $feat->get_tag_values('db_xref'); > > foreach my $db_xref (@db_xrefs) { > if ( $db_xref =~ m/^GO:/ ) { > my $term = Bio::Annotation::OntologyTerm->new(-identifier => $db_xref, > -ontology => 'Gene Ontology'); > $feat->annotation->add_Annotation($term); > } > } > } > } > > return ($seq); > } > > But this gives this error: > > preparing INSERT statement: INSERT INTO seqfeature_qualifier_value (seqfeature_id, term_id, rank) VALUES (?, ?, ?) > TermAdaptor::add_assoc: binding column 1 to "935181" (FK to Bio::SeqFeature::Generic) > TermAdaptor::add_assoc: binding column 2 to "50253" (FK to Bio::Annotation::OntologyTerm) > TermAdaptor::add_assoc: binding column 3 to "1" (rank) > > --------------------- WARNING --------------------- > MSG: TermAdaptor::add_assoc: unexpected failure of statement execution: ERROR: null value in column "value" violates not-null constraint > name: INSERT ASSOC [1] Bio::SeqFeature::Generic;Bio::Annotation::OntologyTerm > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::add_association /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:458 > STACK Bio::DB::BioSQL::AnnotationCollectionAdaptor::add_association /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:468 > STACK Bio::DB::BioSQL::SeqFeatureAdaptor::store_children /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/SeqFeatureAdaptor.pm:304 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264 > STACK Bio::DB::Persistent::PersistentObject::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/Persistent/PersistentObject.pm:284 > STACK Bio::DB::BioSQL::SeqAdaptor::store_children /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/SeqAdaptor.pm:257 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264 > STACK Bio::DB::Persistent::PersistentObject::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/Persistent/PersistentObject.pm:284 > STACK (eval) /var/users/adam/BioPerl/bioperl-db/scripts/biosql/load_seqdatabase.pl:630 > STACK toplevel /var/users/adam/BioPerl/bioperl-db/scripts/biosql/load_seqdatabase.pl:612 > > As you can see it generates an INSERT against seqfeature_qualifier_value without including a 'value' field, which is of course defined as NOT NULL. > > Firstly, is this the best way to achieve this? And secondly, where is the INSERT statement put together? I can't seem to find it in the object hierarchy. > > Thanks > > adam > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From ulrik.stervbo at gmail.com Sun Aug 21 13:33:44 2011 From: ulrik.stervbo at gmail.com (Ulrik Stervbo) Date: Sun, 21 Aug 2011 19:33:44 +0200 Subject: [Bioperl-l] Change of Expasy Protparam url Message-ID: It seems there are some minor changes to the URLs for the ExPASy services. 
In Protparam.pm, line 110 should be changed from @args=('-url'=>'http://www.expasy.org/cgi-bin/protparam','-form'=>'sequence',@args); to @args=('-url'=>'http://web.expasy.org/cgi-bin/protparam/protparam','-form'=>'sequence',@args); At least it seems to be working here, after adding the change to my local Protparam.pm Cheers, Ulrik From cjfields at illinois.edu Sun Aug 21 13:56:10 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 21 Aug 2011 12:56:10 -0500 Subject: [Bioperl-l] Change of Expasy Protparam url In-Reply-To: References: Message-ID: <9178B7E4-6EF2-4BC7-9B1C-9E5B282B5012@illinois.edu> Thanks for pointing that out. I've updated that on github. The critical thing is to get some tests working, so a failure for the webservice doesn't happen again w/o some exceptions (so we can track this). chris On Aug 21, 2011, at 12:33 PM, Ulrik Stervbo wrote: > it seems there are some minor changes with the urls for the expasy-services. > > In the Protparam.pm, line 110 should be changed from > @args=('-url'=>'http://www.expasy.org/cgi-bin/protparam','-form'=>'sequence',@args); > > to > > @args=('-url'=>'http://web.expasy.org/cgi-bin/protparam/protparam','-form'=>'sequence',@args); > > At least it seems to be working here, after adding the change to my > local Protparam.pm > > Cheers, > Ulrik > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Mon Aug 22 11:51:55 2011 From: scott at scottcain.net (Scott Cain) Date: Mon, 22 Aug 2011 11:51:55 -0400 Subject: [Bioperl-l] pls help.. In-Reply-To: References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Message-ID: Hello Ravi, Please keep the BioPerl mailing list cc'ed. 
I downloaded your 1.7GB multi-genbank file and started processing it with bp_genbank2gff3.pl and I killed it when the GFF file got to 10GB, however, it was working as expected. I suggest you upgrade to the most recent release of BioPerl and try again. Additionally, it might make sense to break that big multi-genbank file into smaller files. Scott On Sun, Aug 21, 2011 at 11:33 AM, Ravi Devani wrote: > scott i hv given the link to the gbk file, please kindly help me > > On 8/19/11, Scott Cain wrote: >> Ravi, >> >> I also meant to ask what version of BioPerl you are using. ?When I run >> this command >> >> ? bp_genbank2gff3.pl NW_002121371.gbk >> >> I get a rather dull GFF3 file with 4 lines of GFF (one region and >> three gaps) and a fasta section. >> >> Scott >> >> >> On Fri, Aug 19, 2011 at 12:33 PM, Ravi Devani >> wrote: >>> No the genbank file has not been manipulated >>> >>> On 8/19/11, Scott Cain wrote: >>>> I was wondering if perhaps the genbank file had been manipulated in some >>>> way. >>>> >>>> Scott >>>> >>>> >>>> On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields >>>> wrote: >>>>> Scott, >>>>> >>>>> http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview >>>>> >>>>> (it's in the GFF file) >>>>> >>>>> It definitely is getting stuck in a loop for the genomic region, but >>>>> using >>>>> the file for GFF3 doesn't make sense (very few features of note). >>>>> >>>>> On Aug 19, 2011, at 8:45 AM, Scott Cain wrote: >>>>> >>>>>> Ravi, >>>>>> >>>>>> The gff file is fairly useless from a debugging perspective. Can you >>>>>> please attach the genbank file you're using? ?Also, please indicate >>>>>> what >>>>>> version of bioperl you're using. >>>>>> >>>>>> Scott >>>>>> >>>>>> >>>>>> Sent from my iPhone >>>>>> >>>>>> On Aug 19, 2011, at 1:16 AM, Ravi Devani >>>>>> wrote: >>>>>> >>>>>>> ---------- Forwarded message ---------- >>>>>>> From: Ravi Devani >>>>>>> Date: Thu, Aug 18, 2011 at 12:40 PM >>>>>>> Subject: pls help.. 
>>>>>>> To: scott at scottcain.net >>>>>>> >>>>>>> >>>>>>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl >>>>>>> but >>>>>>> what i get is same features repeating many times.. and the file keeps >>>>>>> growing in size ntil my harddisk gets full.. i have tried to filter >>>>>>> all >>>>>>> other features except "region" but still it repeats a single entry >>>>>>> many >>>>>>> times.. ?i have attached a part of the file generated.. pls kindly >>>>>>> help >>>>>>> me. >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Bioperl-l mailing list >>>>>>> Bioperl-l at lists.open-bio.org >>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>> _______________________________________________ >>>>>> Bioperl-l mailing list >>>>>> Bioperl-l at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>>> >>>> >>>> >>>> >>>> -- >>>> ------------------------------------------------------------------------ >>>> Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? scott at scottcain >>>> dot >>>> net >>>> GMOD Coordinator (http://gmod.org/)? ? ? ? ? ? ? ? ? ?? 216-392-3087 >>>> Ontario Institute for Cancer Research >>>> >>> >> >> >> >> -- >> ------------------------------------------------------------------------ >> Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? scott at scottcain dot >> net >> GMOD Coordinator (http://gmod.org/)? ? ? ? ? ? ? ? ? ?? 216-392-3087 >> Ontario Institute for Cancer Research >> > -- ------------------------------------------------------------------------ Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? scott at scottcain dot net GMOD Coordinator (http://gmod.org/)? ? ? ? ? ? ? ? ? ?? 
216-392-3087 Ontario Institute for Cancer Research From allenday at ionflux.com Mon Aug 22 14:40:33 2011 From: allenday at ionflux.com (Allen Day, PhD) Date: Mon, 22 Aug 2011 18:40:33 +0000 Subject: [Bioperl-l] Beijing and Los Angeles Human NGS Biostatistics/Informatics jobs Message-ID: Hi all, Ion Flux is a startup that I just created to apply NGS technology to the clinical diagnostics field. We like to think of ourselves as an enterprise class "23andme". This is an early-stage startup -- you will have a chance to influence the company and to be rewarded accordingly. I am Allen, the founder. We have a couple of open positions - for smart, passionate, scientist / engineering types. Others need not apply. Please check out these job descriptions, if this sparks your interest: http://ionflux.com/blog/careers/bioinformatician-data-modeling-and-processing/ http://ionflux.com/blog/careers/bioinformatician-data-analysis-and-algorithms/ Our offices are in Los Angeles (UCLA adjacent) and Beijing. I'm happy to post future openings to other lists if this isn't the right venue for an occasional job announcement. -Allen From acpatel at gmail.com Mon Aug 22 15:25:50 2011 From: acpatel at gmail.com (Anand Patel) Date: Mon, 22 Aug 2011 14:25:50 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there Message-ID: I'm trying to get Primer3Redux to work, and am noticing some strange things. While I found and changed my parameters to the new primer3 2.2.3 parameters, I still can't find add_targets. Assigning the parameters using set_parameters works for primer3redux; add_targets is "leftover" from primer3. So is this a doc/POD issue? Thanks, Anand Anand C. 
Patel, MD MS Washington University School of Medicine acpatel at gmail.com From cjfields1 at gmail.com Mon Aug 22 15:42:28 2011 From: cjfields1 at gmail.com (Christopher Fields) Date: Mon, 22 Aug 2011 14:42:28 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: References: Message-ID: On Aug 22, 2011, at 2:25 PM, Anand Patel wrote: > I'm trying to get Primer3Redux to work, and am noticing some strange > things. While I found and changed my parameters to the new primer3 > 2.2.3 parameters, I still can't find add_targets. > > Assigning the parameters using set_parameters works for primer3redux, > add_targets is ?leftover? from primer3. > > So is this a doc/POD issue? I'm confused. You are trying to use add_targets with the latest primer3, but it isn't there? Or is the Primer3Redux wrapper missing this parameter? chris > Thanks, > Anand > > Anand C. Patel, MD MS > Washington University School of Medicine > acpatel at gmail.com > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From acpatel at gmail.com Mon Aug 22 15:52:12 2011 From: acpatel at gmail.com (Anand Patel) Date: Mon, 22 Aug 2011 14:52:12 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: References: Message-ID: my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => "temp.out", -path => "/usr/bin/primer3_core"); If I use this: $primer3->add_targets( 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' =>$PRIMER_PRODUCT_SIZE_RANGE); I get: Can't locate object 
method "add_targets" via package "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. On the other hand, if I change that line to: $primer3->set_parameters( 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' =>$PRIMER_PRODUCT_SIZE_RANGE); It works. When I looked at the source code for Primer3Redux, I couldn't find add_targets, but set_parameters looked like it might work, so I used that instead, and it worked. But I see over in the github that there are other issues with the documentation (how primer3redux's result object is now 3 deep rather than 2 deep). Not sure if this is in that category or not. Thanks, Anand On Mon, Aug 22, 2011 at 2:42 PM, Christopher Fields wrote: > On Aug 22, 2011, at 2:25 PM, Anand Patel wrote: > >> I'm trying to get Primer3Redux to work, and am noticing some strange >> things. ?While I found and changed my parameters to the new primer3 >> 2.2.3 parameters, I still can't find add_targets. >> >> Assigning the parameters using set_parameters works for primer3redux, >> add_targets is ?leftover? from primer3. >> >> So is this a doc/POD issue? > > I'm confused. ?You are trying to use add_targets with the latest primer3, but it isn't there? ?Or is the Primer3Redux wrapper missing this parameter? > > chris > >> Thanks, >> Anand >> >> Anand C. 
Patel, MD MS >> Washington University School of Medicine >> acpatel at gmail.com >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Mon Aug 22 16:10:25 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 22 Aug 2011 15:10:25 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: References: Message-ID: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote: > my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => > "temp.out", -path => "/usr/bin/primer3_core"); > > If I use this: > $primer3->add_targets( > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > =>$PRIMER_PRODUCT_SIZE_RANGE); > > I get: > Can't locate object method "add_targets" via package > "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. > > On the other hand, if I change that line to: > $primer3->set_parameters( > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > =>$PRIMER_PRODUCT_SIZE_RANGE); > > It works. 
When I looked at the source code for Primer3Redux, I
> couldn't find add_targets, but set_parameters looked like it might
> work, so I used that instead, and it worked.
>
> But I see over in the github that there are other issues with the
> documentation (how primer3redux's result object is now 3 deep rather
> than 2 deep). Not sure if this is in that category or not.

That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github; if you can file this as well it would help.

I have to go back and look at the history of add_targets() relative to the primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience...

chris

> Thanks,
> Anand
...

From miquel.amat at me.com Tue Aug 23 16:11:15 2011
From: miquel.amat at me.com (Miguel A. Amat)
Date: Tue, 23 Aug 2011 16:11:15 -0400
Subject: [Bioperl-l] Installation on OS X Lion
Message-ID:

I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql.

I am running the latest version of CPAN, perl 5.12.3 (per perl -v), and the BioPerl-1.6.1 package. The installation was being conducted interactively via the "perl Build.PL" command.

Can you provide some help, or suggest an alternative way of installing BioPerl?

From cjfields at illinois.edu Tue Aug 23 20:14:49 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 23 Aug 2011 19:14:49 -0500
Subject: [Bioperl-l] Installation on OS X Lion
In-Reply-To:
References:
Message-ID:

Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req.
DBD::mysql is only needed if you intend on using modules requiring that functionality. chris On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: > I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. > > I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. > > Can you provide some help, or suggest an alternative way of installing BioPerl? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From miquel.amat at me.com Tue Aug 23 23:25:31 2011 From: miquel.amat at me.com (Miguel A Amat) Date: Tue, 23 Aug 2011 23:25:31 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: References: Message-ID: Thanks for the feedback, Chris. Now I just need to get GD to install ... On Aug 23, 2011, at 8:14 PM, Chris Fields wrote: > Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend on using modules requiring that functionality. > > chris > > On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: > >> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >> >> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >> >> Can you provide some help, or suggest an alternative way of installing BioPerl? 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From scott at scottcain.net Wed Aug 24 10:31:44 2011 From: scott at scottcain.net (Scott Cain) Date: Wed, 24 Aug 2011 10:31:44 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: References: Message-ID: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> Hi Miguel, Did you try the installer for snow leopard on sourceforge: http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/ I don't know if it will work on lion but I don't have a copy of lion yet to try it out on. Scott Sent from my iPhone On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote: > Thanks for the feedback, Chris. Now I just need to get GD to > install ... > > On Aug 23, 2011, at 8:14 PM, Chris Fields > wrote: > >> Try installing the latest version from CPAN; this bypasses the >> Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend >> on using modules requiring that functionality. >> >> chris >> >> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >> >>> I am trying to install bioperl on mac os x 10.7 but ran into >>> problems with the dependency packages Bio::ASN1::EntrezGene and >>> DBD::mysql. >>> >>> I am running the latest version of CPAN and perl -v 5.12.3 and the >>> BioPerl-1.6.1 package. The installation was being conducted >>> interactively through via the "perl Build.PL" command. >>> >>> Can you provide some help, or suggest an alternative way of >>> installing BioPerl? 
>>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From sheena.scroggins at gmail.com Wed Aug 24 12:21:07 2011 From: sheena.scroggins at gmail.com (Sheena Scroggins) Date: Wed, 24 Aug 2011 09:21:07 -0700 Subject: [Bioperl-l] End of GSoC Message-ID: I just wanted to give a GIANT thanks to my mentors on the BioPerl project, Rob Buels and Chris Fields. They helped me tremendously and we made great progress on the reorganization. All of the modules we extracted can be found on github at https://github.com/bioperl We used a Dist Zilla plugin bundle, which can also be found there. The steps used in the process will be outlined on the BioPerl wiki in the upcoming weeks. The reorganization is off to a great start and by outlining the workflow I'm hoping others will be able to contribute more easily. The progress updates were posted at techomics.com during the project, although they were sporadic. The original outline of the project can be found there as well. Thanks again to all the mentors of GSoC, this program wouldn't work without you! Sheena From miquel.amat at me.com Wed Aug 24 13:48:06 2011 From: miquel.amat at me.com (Miguel A. Amat) Date: Wed, 24 Aug 2011 13:48:06 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> References: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> Message-ID: <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com> Thanks for all the help; I finally got it to work. Here are the steps I took: upgraded CPAN and used latest version of BioPerl installed dependencies in interactive mode, but GD failed. 
Quit the installation and tried "install GD-SVG"; this one seems to have less functionality than GD, but it worked.
Installed Bio::Perl.
Then, installed Bio::ASN1::EntrezGene

Best.

On Aug 24, 2011, at 10:31 AM, Scott Cain wrote:

> Hi Miguel,
>
> Did you try the installer for snow leopard on sourceforge:
>
> http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/
>
> I don't know if it will work on lion but I don't have a copy of lion yet to try it out on.
>
> Scott
>
>
> Sent from my iPhone
>
> On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote:
>
>> Thanks for the feedback, Chris. Now I just need to get GD to install ...
>>
>> On Aug 23, 2011, at 8:14 PM, Chris Fields wrote:
>>
>>> Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend on using modules requiring that functionality.
>>>
>>> chris
>>>
>>> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote:
>>>
>>>> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql.
>>>>
>>>> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command.
>>>>
>>>> Can you provide some help, or suggest an alternative way of installing BioPerl?
>>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Aug 24 13:51:19 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Aug 2011 12:51:19 -0500 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com> References: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com> Message-ID: <200F67E8-7B4E-40AD-9C0D-37160B970F22@illinois.edu> Interesting, since GD::SVG requires GD. Anyway, glad to know it's working for you! chris On Aug 24, 2011, at 12:48 PM, Miguel A. Amat wrote: > Thanks for all the help; I finally got it to work. Here are the steps I took: > > > ? upgraded CPAN and used latest version of BioPerl > ? installed dependencies in interactive mode, but GD failed. > ? Quit the installation and tried ?install GD-SVG?; this one seems to have less functionality than GD, but it worked. > ? Installed Bio::Perl. > ? Then, installed Bio::ASN1::EntrezGene > > > > > > > Best. > > > On Aug 24, 2011, at 10:31 AM, Scott Cain wrote: > >> Hi Miguel, >> >> Did you try the installer for snow leopard on sourceforge: >> >> http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/ >> >> I don't know if it will work on lion but I don't have a copy of lion yet to try it out on. >> >> Scott >> >> >> Sent from my iPhone >> >> On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote: >> >>> Thanks for the feedback, Chris. Now I just need to get GD to install ... >>> >>> On Aug 23, 2011, at 8:14 PM, Chris Fields wrote: >>> >>>> Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. 
DBD::mysql is only needed if you intend on using modules requiring that functionality. >>>> >>>> chris >>>> >>>> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >>>> >>>>> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >>>>> >>>>> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >>>>> >>>>> Can you provide some help, or suggest an alternative way of installing BioPerl? >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From abualiga2 at gmail.com Wed Aug 24 14:09:10 2011 From: abualiga2 at gmail.com (galeb abu-ali) Date: Wed, 24 Aug 2011 14:09:10 -0400 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: Hi, I'm trying to run a program that generates a circular genome homology atlas "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think the problem is with the module that appends schemas to the proxy, and I don't know how to do that manually. I've emailed the author couple times and have not heard back. Pasted below is the error message. At your convenience, I'd greatly appreciate your help. thanks galeb p/s - also, is there another program that can generate concetric circular plots of BLAST scores for multiple bacterial genomes with a per nucleotide resolution? 
thanks [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position Preference" -title "B. pseudomallei K96243" > sgeneric.ps # title set to 'B. pseudomallei K96243' # output format is ps # modus is 'circle' # loading reference genome ... # loading proteins ... # parsing blast lane configuration (blast.cfg) ... # .. parsing blast lane (B. ubonensis Bu) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done # .. parsing blast lane (B. pseudomallei DM98) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done # parsing custom lane configuration (custom.cfg) ... # .. parsing custom data entry SIDD at -0.035 ... # .. .. parsing color 000010_101010 # .. .. .. color from: r:00, g:00, b:10 # .. .. .. color to: r:10, g:10, b:10 # .. .. byrange: 9 .. 10 # .. .. boxfilter 5000 ... # .. parsing data source 'gunzip -c BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz | cut -f4 |' ... # .. .. parsing data source ... 3173005 done # reading external files and build hash of sequences ... 
*panic: schemas() removed in v2.00, not needed anymore* at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at xml-compile.pl line 48 main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at BLASTatlas line 177 From roy.chaudhuri at gmail.com Wed Aug 24 14:21:12 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 24 Aug 2011 19:21:12 +0100 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: <4E554118.90108@gmail.com> Hi Galeb, This is the wrong mailing list for your question - it's intended for discussion of the Bioperl toolkit, not general bioinformatics questions. Next time, try a general bioinformatics mailing list such as BBB: http://www.bioinformatics.org/lists/bbb Having said all that, maybe you could try BRIG: http://sourceforge.net/projects/brig/ http://www.biomedcentral.com/1471-2164/12/402 Cheers, Roy. On 24/08/2011 19:09, galeb abu-ali wrote: > Hi, > > I'm trying to run a program that generates a circular genome homology atlas > "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think > the problem is with the module that appends schemas to the proxy, and I > don't know how to do that manually. I've emailed the author couple times and > have not heard back. Pasted below is the error message. At your convenience, > I'd greatly appreciate your help. > > thanks > > galeb > > p/s - also, is there another program that can generate concetric circular > plots of BLAST scores for multiple bacterial genomes with a per nucleotide > resolution? 
thanks > > [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa > -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg > -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position > Preference" -title "B. pseudomallei K96243"> sgeneric.ps > # title set to 'B. pseudomallei K96243' > # output format is ps > # modus is 'circle' > # loading reference genome ... > # loading proteins ... > # parsing blast lane configuration (blast.cfg) ... > # .. parsing blast lane (B. ubonensis Bu) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done > # .. parsing blast lane (B. pseudomallei DM98) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done > # parsing custom lane configuration (custom.cfg) ... > # .. parsing custom data entry SIDD at -0.035 ... > # .. .. parsing color 000010_101010 > # .. .. .. color from: r:00, g:00, b:10 > # .. .. .. color to: r:10, g:10, b:10 > # .. .. byrange: 9 .. 10 > # .. .. boxfilter 5000 ... > # .. parsing data source 'gunzip -c > BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz > | cut -f4 |' ... > # .. .. parsing data source ... 3173005 done > # reading external files and build hash of sequences ... 
> *panic: schemas() removed in v2.00, not needed anymore* > at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 > XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at > xml-compile.pl line 48 > main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " > http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " > http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at > BLASTatlas line 177 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Aug 24 14:22:26 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Aug 2011 13:22:26 -0500 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: <61E5BF3C-653F-40D3-8764-0DA61859BC8B@illinois.edu> Sorry, but this doesn't have anything to do with BioPerl. Not sure you'll get an answer here. chris On Aug 24, 2011, at 1:09 PM, galeb abu-ali wrote: > Hi, > > I'm trying to run a program that generates a circular genome homology atlas > "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think > the problem is with the module that appends schemas to the proxy, and I > don't know how to do that manually. I've emailed the author couple times and > have not heard back. Pasted below is the error message. At your convenience, > I'd greatly appreciate your help. > > thanks > > galeb > > p/s - also, is there another program that can generate concetric circular > plots of BLAST scores for multiple bacterial genomes with a per nucleotide > resolution? thanks > > [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa > -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg > -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position > Preference" -title "B. pseudomallei K96243" > sgeneric.ps > # title set to 'B. 
pseudomallei K96243' > # output format is ps > # modus is 'circle' > # loading reference genome ... > # loading proteins ... > # parsing blast lane configuration (blast.cfg) ... > # .. parsing blast lane (B. ubonensis Bu) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done > # .. parsing blast lane (B. pseudomallei DM98) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done > # parsing custom lane configuration (custom.cfg) ... > # .. parsing custom data entry SIDD at -0.035 ... > # .. .. parsing color 000010_101010 > # .. .. .. color from: r:00, g:00, b:10 > # .. .. .. color to: r:10, g:10, b:10 > # .. .. byrange: 9 .. 10 > # .. .. boxfilter 5000 ... > # .. parsing data source 'gunzip -c > BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz > | cut -f4 |' ... > # .. .. parsing data source ... 3173005 done > # reading external files and build hash of sequences ... 
> *panic: schemas() removed in v2.00, not needed anymore* > at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 > XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at > xml-compile.pl line 48 > main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " > http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " > http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at > BLASTatlas line 177 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From abualiga2 at gmail.com Wed Aug 24 14:39:33 2011 From: abualiga2 at gmail.com (abualiga2 at gmail.com) Date: Wed, 24 Aug 2011 18:39:33 +0000 Subject: [Bioperl-l] append schema to proxy In-Reply-To: <4E554118.90108@gmail.com> Message-ID: <00504502ec3723598f04ab44a23f@google.com> Roy, thanks! I'll try that. galeb On Aug 24, 2011 2:21pm, Roy Chaudhuri wrote: > Hi Galeb, > This is the wrong mailing list for your question - it's intended for > discussion of the Bioperl toolkit, not general bioinformatics questions. > Next time, try a general bioinformatics mailing list such as BBB: > http://www.bioinformatics.org/lists/bbb > Having said all that, maybe you could try BRIG: > http://sourceforge.net/projects/brig/ > http://www.biomedcentral.com/1471-2164/12/402 > Cheers, > Roy. From slucky at ibab.ac.in Mon Aug 22 02:01:16 2011 From: slucky at ibab.ac.in (Lucky Singh) Date: Mon, 22 Aug 2011 11:31:16 +0530 (IST) Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast Message-ID: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Dear sir/Ma'am, I am student of Institute of Bioinformatics and Applied Biotechnology, Bangalore, India. While doing my project work I needed remoteblast.pm. So I used default example program which is available with this package. 
Now I wanted to host it on a web server, but the program does not work there. Maybe it is not able to create or write files when run from the web server, although on the command line it works fine. I don't know the possible reason; please help me figure it out.

-> I am using the same example program with a basic CGI modification for taking input from a web browser.
-> Ubuntu 10.04 64 bit OS
-> apache2 server
-> I have given all permissions (777) recursively to the cgi-bin folder

--
Regards,
Lucky Singh
Institute of Bioinformatics and Applied Biotechnology,
------------------------------------------------------
Biotech Park
Electronics City Phase I
Bangalore 560 100
India.
Tel: 080-28528900, 080-28528901, 080-28528902
Fax: 080-28528904

From abualiga2 at gmail.com Wed Aug 24 13:26:10 2011
From: abualiga2 at gmail.com (galeb abu-ali)
Date: Wed, 24 Aug 2011 13:26:10 -0400
Subject: [Bioperl-l] append schema to proxy
Message-ID:

Hi,

I'm trying to run a program that generates a circular genome homology atlas "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think the problem is with the module that appends schemas to the proxy, and I don't know how to do that manually. I've emailed the author a couple of times and have not heard back. Pasted below is the error message. At your convenience, I'd greatly appreciate your help.

thanks

galeb

p/s - also, is there another program that can generate concentric circular plots of BLAST scores for multiple bacterial genomes with a per nucleotide resolution? thanks

[galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position Preference" -title "B. pseudomallei K96243" > sgeneric.ps
# title set to 'B. pseudomallei K96243'
# output format is ps
# modus is 'circle'
# loading reference genome ...
# loading proteins ...
# parsing blast lane configuration (blast.cfg) ...
# ..
parsing blast lane (B. ubonensis Bu) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done # .. parsing blast lane (B. pseudomallei DM98) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done # parsing custom lane configuration (custom.cfg) ... # .. parsing custom data entry SIDD at -0.035 ... # .. .. parsing color 000010_101010 # .. .. .. color from: r:00, g:00, b:10 # .. .. .. color to: r:10, g:10, b:10 # .. .. byrange: 9 .. 10 # .. .. boxfilter 5000 ... # .. parsing data source 'gunzip -c BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz | cut -f4 |' ... # .. .. parsing data source ... 3173005 done # reading external files and build hash of sequences ... *panic: schemas() removed in v2.00, not needed anymore* at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at xml-compile.pl line 48 main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at BLASTatlas line 177 From jj.emerson at gmail.com Wed Aug 24 21:53:38 2011 From: jj.emerson at gmail.com (J.J. Emerson) Date: Wed, 24 Aug 2011 18:53:38 -0700 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe Message-ID: Hello All, I have experienced some behavior in SeqIO that doesn't seem to be what I would expect. Basically, for a certain script, if I try to pass something like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the following two conditions are met simultaneously: 1. 
STDIN is coming from a pipe;
2. SeqIO is trying to guess the format.

If STDIN is coming from redirection instead of a pipe, or if the format is specified manually (i.e. BioPerl doesn't have to guess), the error doesn't seem to occur.

This issue has been reported previously:
http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html
https://redmine.open-bio.org/issues/3122

This issue is ultimately one of using seek() on a pipe, which is not allowed because pipes are not seekable (see below). To be clear, there are kludgy ways around this that allow BioPerl to take input from a pipe AND guess the format. My naive and inefficient kludge was to test for reading from STDIN and for the absence of a format. If both of these conditions are met, then I slurp STDIN into a variable, open a filehandle on that variable, and pass it to SeqIO, which can guess the format if the fh isn't opened on a pipe. SeqIO then successfully guesses the format and does the SeqIO thing, at the expense of having the program pass over the data at least twice. And if the input file is huge, it could potentially consume all the memory. A better way to address the problem would be to process the input one line at a time, but this seems to require more extensive changes.

The reason I'm reposting this is that I think the inability to guess the sequence format from data originating from a pipe is an important limitation for a fundamental part of BioPerl. When designing scripts to be used in pipelines, the inability to guess formats for piped data limits BioPerl's pipelineability substantially. Even though previous reports of this have been made and a bug opened and closed, I was wondering if anyone thought this was worth fixing so as to make SeqIO (and probably AlignIO as well?) more flexible. Does anyone think this should be refiled as a bug?

Cheers,
J.J.

PS Below are snippets of code and/or errors related to reproducing the failure to guess unspecified formats.
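
For reference, the slurp-and-reopen kludge described in this message could be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the attached script and not anything shipped with BioPerl: the helper names seekable_copy and seqio_from_stdin are made up for illustration, and the only BioPerl calls relied on are Bio::SeqIO->new with -fh / -format and next_seq. It carries the same caveat as above: the whole input is held in memory.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): copy everything from a
# possibly unseekable handle into memory and return a read handle opened
# on that in-memory copy. Unlike a pipe, the returned handle supports
# seek(), which the format guesser needs in order to rewind.
sub seekable_copy {
    my ($in) = @_;
    my $data = do { local $/; <$in> };    # undef $/ means "read to EOF"
    $data = '' unless defined $data;
    open(my $mem, '<', \$data) or die "Cannot open in-memory handle: $!";
    return $mem;
}

# Hypothetical wrapper (not part of BioPerl): build a Bio::SeqIO reader on
# STDIN, buffering first only when STDIN is a pipe AND no format is given.
sub seqio_from_stdin {
    my ($format) = @_;
    require Bio::SeqIO;    # loaded lazily; seekable_copy() itself is core Perl
    my $fh = \*STDIN;
    $fh = seekable_copy($fh) if !defined $format && -p STDIN;
    my @args = (-fh => $fh);
    push @args, (-format => $format) if defined $format;
    return Bio::SeqIO->new(@args);
}

# Possible use in a script such as bioperl_fhtest.pl (format optional):
#   my $in = seqio_from_stdin($ARGV[0]);
#   while (my $seq = $in->next_seq) { print $seq->display_id, "\n" }
```

Buffering only in the pipe-plus-guessing case keeps the other three cases (redirection, or an explicit format) on the normal streaming path, so the memory cost is paid only where the seek() would otherwise fail.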
I'll see how Mailman treats my attachments and post the code as a reply if they don't work.

The bioperl_fhtest.pl attachment is the script that reproduces the error. w.fa is a FASTA file containing some sequences.

Here are the command lines to generate the behavior I observe (w.fa is a file containing some FASTA sequences; in my case it was the w gene from different *Drosophila* species):

    ./bioperl_fhtest.pl fasta < w.fa        # Works (redirection, no guessing)
    ./bioperl_fhtest.pl < w.fa              # Works (redirection, guessing)

    cat w.fa | ./bioperl_fhtest.pl fasta    # Works (pipe, no guessing)
    cat w.fa | ./bioperl_fhtest.pl          # DOESN'T work (pipe, guessing)

Here's the error I get in the last case:

    ------------- EXCEPTION: Bio::Root::Exception -------------
    MSG: Failed resetting the filehandle; IO error occurred
    STACK: Error::throw
    STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472
    STACK: Bio::Tools::GuessSeqFormat::guess /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512
    STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381
    STACK: ./bioperl_fhtest.pl:8
    -----------------------------------------------------------

From what I gather, the error is triggered by a failure of seek() on the STDIN fh on lines 517-518 (text from the version of GuessSeqFormat.pm installed on my server):

    512         if (defined $self->{-file}) {
    513             # Close the file we opened.
    514             close($fh);
    515         } elsif (ref $fh eq 'GLOB') {
    516             # Try seeking to the start position.
    517             seek($fh, $start_pos, 0) || $self->throw("Failed resetting the ".
    518                                                      "filehandle; IO error occurred");;
    519         } elsif (defined $fh && $fh->can('setpos')) {
    520             # Seek to the start position.
    521             $fh->setpos($start_pos);
    522         }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bioperl_fhtest.pl
Type: text/x-perl-script
Size: 505 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: w.fa
Type: application/octet-stream
Size: 6335 bytes
Desc: not available
URL:

From frederic.sapet at gmail.com Thu Aug 25 09:24:08 2011
From: frederic.sapet at gmail.com (Frédéric Sapet)
Date: Thu, 25 Aug 2011 15:24:08 +0200
Subject: [Bioperl-l] fasta35 and fasta36 parsing support in BioPerl
Message-ID:

Hello

I have tried to parse a fasta35 report file using BioPerl, in order to produce a valid HTML file. It seems to work well, but there's a small issue with the homology string in the report. Please find a test script in the attached files.

After that, I tried to parse a fasta36 file, but this does not seem to be supported yet. Here is the error thrown:

Uncaught exception from user code:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unrecognized alignment line (3) '>--'
STACK: Error::throw
STACK: Bio::Root::Root::throw /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm:472
STACK: Bio::SearchIO::fasta::next_result /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm:1061
STACK: ./test.pl:36
-----------------------------------------------------------
 at /usr/lib/perl5/site_perl/5.10.0/Error.pm line 184
Error::throw('Bio::Root::Exception', 'Unrecognized alignment line (3) \'>--\'') called at /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm line 472
Bio::Root::Root::throw('Bio::SearchIO::fasta=HASH', 'Unrecognized alignment line (3) \'>--\'') called at /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm line 1061
Bio::SearchIO::fasta::next_result('Bio::SearchIO::fasta=HASH') called at ./test.pl line 36

Thank you

Fred
-------------- next part --------------
A non-text attachment was scrubbed...
Name: FastaBioPerl.tar.bz2
Type: application/x-bzip2
Size: 7692 bytes
Desc: not available
URL:

From miquel.amat at me.com Tue Aug 23 02:07:54 2011
From: miquel.amat at me.com (Miguel A. Amat)
Date: Tue, 23 Aug 2011 02:07:54 -0400
Subject: [Bioperl-l] Help
Message-ID: <44829080-5467-4103-AF5B-D09CBDA6F99F@me.com>

I am trying to install BioPerl on Mac OS X 10.7 but ran into problems with the dependencies Bio::ASN1::EntrezGene and DBD::mysql. I am running the latest version of CPAN, Perl 5.12.3, and the BioPerl-1.6.1 package. The installation was being conducted interactively via the "perl Build.PL" command. Can you provide some help?

From bosborne11 at verizon.net Thu Aug 25 10:35:29 2011
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 25 Aug 2011 10:35:29 -0400
Subject: [Bioperl-l] SeqIO alters Genbank files
Message-ID:

bioperl-l,

I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file:

    /score=100.1

And adding a "note" tag, so the output file contains this:

    /score=100.1
    /note="score=100.1"

I'm assuming that the code does this because NCBI will not accept score tags and values, even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format.

On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this:

    /score=100.1
    /note="score=100.1"
    /note="score=100.1"
    /note="score=100.1"
    /note="score=100.1"

Should I comment out the code that's doing these edits or not?

Thanks again,

Brian O.
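
The round-trip growth described above can be reproduced with a toy model. This is only a sketch, not FTHelper's actual code: the tag store (a hash of array references), write_feature() and its skip_existing option are all hypothetical names invented for illustration. It shows that an unconditional "copy /score into /note" on every write adds one note per round trip, whereas a write that first checks for an existing identical note leaves the count at one.

```perl
use strict;
use warnings;

# Toy model only: a feature's qualifiers stored as tag => [values].
# write_feature() is hypothetical; it merely mimics the reported behaviour
# of copying each /score value into a /note every time a feature is written.
sub write_feature {
    my ($tags, %opt) = @_;

    # Copy the tag lists so the input feature is left untouched.
    my %out = map { $_ => [ @{ $tags->{$_} } ] } keys %$tags;

    for my $score (@{ $out{score} || [] }) {
        my $note = "score=$score";
        # Optional guard (an assumption, not existing behaviour):
        # skip the copy if an identical note is already present.
        next if $opt{skip_existing}
            && grep { $_ eq $note } @{ $out{note} || [] };
        push @{ $out{note} }, $note;
    }
    return \%out;
}

# Write and re-read the same feature $n times; return the number of notes.
sub round_trips {
    my ($n, %opt) = @_;
    my $feat = { score => ['100.1'] };
    $feat = write_feature($feat, %opt) for 1 .. $n;
    return scalar @{ $feat->{note} || [] };
}

my $naive   = round_trips(4);
my $guarded = round_trips(4, skip_existing => 1);
print "notes after 4 round trips: naive=$naive, guarded=$guarded\n";
```

Whether the copy should be guarded, made optional, or removed outright is a separate design question; the sketch only illustrates why the unguarded version is not idempotent.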
From cjfields at illinois.edu Thu Aug 25 12:21:15 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:21:15 -0500 Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast In-Reply-To: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> References: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Message-ID: It's hard to evaluate what the problem is w/o code, the BioPerl version, and so on. It's very possible you are using an out-of-date BioPerl. chris On Aug 22, 2011, at 1:01 AM, Lucky Singh wrote: > Dear sir/Ma'am, > > I am student of Institute of Bioinformatics and Applied Biotechnology, > Bangalore, India. While doing my project work I needed remoteblast.pm. So > I used default example program which is available with this package. Now I > wanted to host it from web server, but This program is not working from it > may be it is not able to create or write on file from web server but in > command line it is working fine. I don't know the possible reason, please > help me to figure it out. > > > -> I am using same example program with basic cgi modification for taking > input from web browser. > -> Ubuntu 10.04 64 bit OS > -> apache2 server > -> I have given all permissions 777 recursively to cgi-bin folder > > > -- > Regards, > Lucky Singh > > Institute of Bioinformatics and Applied Biotechnology, > ------------------------------------------------------ > Biotech Park > Electronics City Phase I > Bangalore 560 100 > India. 
> Tel: 080-28528900, 080-28528901, 080-28528902
> Fax: 080-28528904
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at illinois.edu Thu Aug 25 12:34:40 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 25 Aug 2011 11:34:40 -0500
Subject: [Bioperl-l] fasta35 and fasta36 parsing support in BioPerl
In-Reply-To:
References:
Message-ID: <4C95797A-343C-4651-AF0C-964A7E10E8D1@illinois.edu>

Frederic,

The best place to post this is to our bug server:

http://redmine.open-bio.org

Attach all relevant data for the bug; this really helps us to diagnose the issue.

chris

On Aug 25, 2011, at 8:24 AM, Frédéric Sapet wrote:

> Hello
> I have tried to parse a fasta35 report file using BioPerl, in order to
> produce a valid HTML file.
> It seems to work well, but there's a small issue with homology string
> in the report.
> Please find in joined files, a test script.
> > After that, I have tried to parse a fasta36 file, but this seems to be > not supported yet: here is the error thrown : > > Uncaught exception from user code: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Unrecognized alignment line (3) '>--' > STACK: Error::throw > STACK: Bio::Root::Root::throw > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm:472 > STACK: Bio::SearchIO::fasta::next_result > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm:1061 > STACK: ./test.pl:36 > ----------------------------------------------------------- > at /usr/lib/perl5/site_perl/5.10.0/Error.pm line 184 > Error::throw('Bio::Root::Exception', 'Unrecognized alignment line (3) > \'>--\'') called at > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm line > 472 > Bio::Root::Root::throw('Bio::SearchIO::fasta=HASH', 'Unrecognized > alignment line (3) \'>--\'') called at > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm > line 1061 > Bio::SearchIO::fasta::next_result('Bio::SearchIO::fasta=HASH') called > at ./test.pl line 36 > > Thank you > > Fred > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 12:42:30 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:42:30 -0500 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: Brian, I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. 
I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? chris On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: > bioperl-l, > > I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: > > /score=100.1 > > And adding a "note" tag, so the output file contains this: > > /score=100.1 > /note="score=100.1" > > I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. > > On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: > > /score=100.1 > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > > Should I comment out the code that's doing these edits or not? > > Thanks again, > > Brian O. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 12:58:51 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:58:51 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: Message-ID: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> On Aug 24, 2011, at 8:53 PM, J.J. Emerson wrote: > Hello All, > > I have experienced some behavior in SeqIO that doesn't seem to be what I > would expect. 
Basically, for a certain script, if I try to pass something > like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the > following two conditions are met simultaneously: > > 1. STDIN is coming from a pipe; > 2. SeqIO is trying to guess the format. > > If STDIO is coming from redirection instead of a pipe or if the format is > specified manually (i.e. BioPERL doesn't have to guess), the error doesn't > seem to occur. > > This issue has been reported previously: > > http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html > https://redmine.open-bio.org/issues/3122 Yes, this was addressed according to that case. > This issue is ultimately one of using seek() on a pipe, which is forbidden > (see below). To be clear, there are kludgy ways around this that allow > BioPERL to take input from a pipe AND guess the format. My naive and > inefficient kludge was to test for reading from STDIN and for the absence of > a format. If both of these conditions are met, then I slurp STDIN into a > variable and then open a filehandle on that variable, and pass it to SeqIO, > which can guess the format if the fh isn't opened on a pipe. SeqIO then > successfully guesses the format and does the SeqIO thing, at the expense of > having the program pass over the data at least twice. And if the input file > is huge, it could potentially consume all the memory. A better way to > address the problem would be to process the input one line at a time, but > this seems to require more extensive changes. Have you tried tempfiles? Not that this is a great solution, but it's very commonly used for large sequence data, and it is seekable. This behavior could also be wrapped in GuessSeqFormat i suppose (but see below) > The reason I'm reposting this is because I think that the inability to guess > the sequence format from data originating from a pipe is an important > limitation for a fundamental part of BioPERL. 
When designing scripts to be > used in pipelines, the inability to guess formats for piped data limits > BioPERL's pipelineability substantially. Even though previous reports of > this have been made and a bug opened and closed, I was wondering if anyone > thought this was worthwhile fixing so as to make SeqIO (and probably AlignIO > as well?) more flexible? > > Does anyone think this should be refiled as a bug? > > Cheers, > > J.J. The fundamental problem with pipes (as you indicated) is that the data stream is not seekable. We do have a built-in buffer in Bio::Root::IO that somewhat handles this, but Bio::Tools::GuessSeqFormat is (IIRC) designed to use the filehandle directly, bypassing the BioPerl IO layer completely. One solution is to redesign GuessSeqFormat to use Bio::Root::IO, have GuessSeqFormat push all data back to the buffer, then let SeqIO parse. That will require some fundamental changes for both Bio::Root::IO and Bio::SeqIO (note that one cannot pass a Bio::Root::IO instance to another Bio::Root::IO-based class for parsing at this time). The other option is (as hinted above) having GuessSeqFormat dump the data to a tempfile, seek back after guessing, and retain the filehandle for Bio::SeqIO. Not the best solutions, but either should work. My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway? If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. 
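
The tempfile option mentioned above might look roughly like the following. This is a minimal sketch under stated assumptions, not code from BioPerl or GuessSeqFormat: seekable_handle() is a hypothetical helper name, and it uses only the core modules Fcntl and File::Temp. A handle that already supports seek() is returned unchanged; anything else (such as a pipe) is copied in fixed-size chunks to an anonymous temporary file, so memory use stays flat even for large inputs. The short demo at the end uses an in-process pipe to stand in for piped STDIN.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);
use File::Temp qw(tempfile);

# Hypothetical helper (not part of BioPerl): return a filehandle that
# supports seek(). Seekable handles are returned as-is; non-seekable ones
# (e.g. pipes) are copied to a temporary file first.
sub seekable_handle {
    my ($in) = @_;

    # A zero-offset relative seek does not move the handle; it fails on pipes.
    return $in if seek($in, 0, SEEK_CUR);

    # In scalar context tempfile() returns only a read/write handle; the
    # file is removed automatically where the OS supports it.
    my $tmp = tempfile();

    local $/ = \65536;    # read fixed 64 KiB records instead of lines
    while (defined(my $chunk = <$in>)) {
        print {$tmp} $chunk or die "Cannot write to temp file: $!";
    }
    seek($tmp, 0, SEEK_SET) or die "Cannot rewind temp file: $!";
    return $tmp;
}

# Demo: an in-process pipe stands in for "cat w.fa | script.pl".
pipe(my $reader, my $writer) or die "pipe failed: $!";
print {$writer} ">seq1\nACGT\n";
close $writer;

my $fh    = seekable_handle($reader);
my $first = <$fh>;                      # peek, as a format guesser would
seek($fh, 0, SEEK_SET) or die "seek failed: $!";
my @lines = <$fh>;                      # a parser still sees every line

print "peeked: $first";
print "lines after rewind: ", scalar(@lines), "\n";
```

In a script, the returned handle could presumably be passed as -fh to Bio::SeqIO->new() in place of \*STDIN, leaving -format unset; the cost is one extra pass over the data and the disk space for the copy.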
> PS > > Below are snippets of code and/or errors related to reproducing the failure > to guess unspecified formats. I'll see how Mailman treats my attachments and > post the code as a reply if they don't work. > > The bioperl_fhtest.pl attachment is the script that reproduces the error. > The w.fa is a fasta file containing some sequence. > > Here are the command lines to generate the behavior I observe (w.fa is a > file containing some fasta sequences, in my case it was the w gene from > different *Drosophila* species): > > ./bioperl_fhtest.pl fasta < w.fa # Works (redirection, no guessing) >> ./bioperl_fhtest.pl < w.fa # Works (redirection, guessing) >> >> cat w.fa | ./bioperl_fhtest.pl fasta # Works (pipe, no guessing) >> cat w.fa | ./bioperl_fhtest.pl # DOESN'T work (pipe, guessing) >> > > > Here's the error I get in the last case: > > ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: Failed resetting the filehandle; IO error occurred >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472 >> STACK: Bio::Tools::GuessSeqFormat::guess >> /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512 >> STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381 >> STACK: ./bioperl_fhtest.pl:8 >> ----------------------------------------------------------- >> > >> From what I gather, the error is triggered by a failure of seek() on a STDIO > fh on lines 517-518 (text from the version GuessSeqFormat.pm installed on my > server): > > 512 if (defined $self->{-file}) { >> 513 # Close the file we opened. >> 514 close($fh); >> 515 } elsif (ref $fh eq 'GLOB') { >> 516 # Try seeking to the start position. >> 517 seek($fh, $start_pos, 0) || $self->throw("Failed resetting >> the ". >> 518 "filehandle; IO error >> occurred");; >> 519 } elsif (defined $fh && $fh->can('setpos')) { >> 520 # Seek to the start position. 
>> 521 $fh->setpos($start_pos); >> 522 } >> > _______________________________________________ You are always welcome to reopen and update the bug, or file a new one. chris From cjfields at illinois.edu Thu Aug 25 13:16:03 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 12:16:03 -0500 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: <393F144A-AECE-4F7D-B418-B71D46F3C82F@illinois.edu> Brian, Yes, that's correct (comment out or remove the other stuff). Not sure what difference it will make, I'm interested to see if anything fundamental expects this behavior and breaks with tests. Using 'git blame', it appears Allen Day added this in relation to Feature-Annotation code we actually reverted a few years ago, so this should be removed anyway. I still think we should work around FTHelper altogether. Reading the code, it seems like a ton of wasted instances being generated for no apparent reason. Now going back to our bioperl archives to see if there is any need for it... chris On Aug 25, 2011, at 11:53 AM, Brian Osborne wrote: > Chris, > > OK, will do. I should add that an early version of FTHelper was doing this same edit with the "strand", "source_tag", and "frame" tags but someone has commented out the "source_tag" and "strand" lines. > > Should I comment out both "score" and "frame" code? > > BIO > > On Aug 25, 2011, at 12:42 PM, Chris Fields wrote: > >> Brian, >> >> I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). >> >> To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. 
Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? >> >> chris >> >> On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: >> >>> bioperl-l, >>> >>> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >>> >>> /score=100.1 >>> >>> And adding a "note" tag, so the output file contains this: >>> >>> /score=100.1 >>> /note="score=100.1" >>> >>> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >>> >>> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >>> >>> /score=100.1 >>> /note="score=100.1" >>> /note="score=100.1" >>> /note="score=100.1" >>> /note="score=100.1" >>> >>> Should I comment out the code that's doing these edits or not? >>> >>> Thanks again, >>> >>> Brian O. >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bosborne11 at verizon.net Thu Aug 25 12:53:08 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 25 Aug 2011 12:53:08 -0400 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: Chris, OK, will do. 
I should add that an early version of FTHelper was doing this same edit with the "strand", "source_tag", and "frame" tags but someone has commented out the "source_tag" and "strand" lines. Should I comment out both "score" and "frame" code? BIO On Aug 25, 2011, at 12:42 PM, Chris Fields wrote: > Brian, > > I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). > > To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? > > chris > > On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: > >> bioperl-l, >> >> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >> >> /score=100.1 >> >> And adding a "note" tag, so the output file contains this: >> >> /score=100.1 >> /note="score=100.1" >> >> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >> >> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >> >> /score=100.1 >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> >> Should I comment out the code that's doing these edits or not? 
>> >> Thanks again, >> >> Brian O. >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jj.emerson at gmail.com Thu Aug 25 14:52:48 2011 From: jj.emerson at gmail.com (J.J. Emerson) Date: Thu, 25 Aug 2011 11:52:48 -0700 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: Hi Chris, You asked: My question (not a criticism, just trying to understand the problem): why > are you going through all the trouble of using GuessSeqFormat as a permanent > solution anyway? If you have a stream returning a possibly unknown data > type, I would argue that the fundamental bug is not GuessSeqFormat but > something else, more specifically not knowing the behavior of the data > source and the returned format to begin with. Is something preventing that? > In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a > permanent solution to your problems (it is guessing, after all). Note the > code has had very little development over the years, and the related SeqIO > code hasn't aged particularly well. > I see. 
I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? Cheers, J.J. PS * The way I plan on using my script is roughly as follows: prog1 [some arguments] \ | myscript.pl --informat fasta \ | prog2 \ | prog3 > pipeline.output However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. In principle, being able to do the following would be very useful: prog1 [some arguments] \ | myscript.pl \ | prog2 > pipeline.output The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. On Thu, Aug 25, 2011 at 9:58 AM, Chris Fields wrote: > On Aug 24, 2011, at 8:53 PM, J.J. Emerson wrote: > > > Hello All, > > > > I have experienced some behavior in SeqIO that doesn't seem to be what I > > would expect. Basically, for a certain script, if I try to pass something > > like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the > > following two conditions are met simultaneously: > > > > 1. STDIN is coming from a pipe; > > 2. SeqIO is trying to guess the format. > > > > If STDIO is coming from redirection instead of a pipe or if the format is > > specified manually (i.e. BioPERL doesn't have to guess), the error > doesn't > > seem to occur. > > > > This issue has been reported previously: > > > > http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html > > https://redmine.open-bio.org/issues/3122 > > Yes, this was addressed according to that case. 
> > > This issue is ultimately one of using seek() on a pipe, which is > forbidden > > (see below). To be clear, there are kludgy ways around this that allow > > BioPERL to take input from a pipe AND guess the format. My naive and > > inefficient kludge was to test for reading from STDIN and for the absence > of > > a format. If both of these conditions are met, then I slurp STDIN into a > > variable and then open a filehandle on that variable, and pass it to > SeqIO, > > which can guess the format if the fh isn't opened on a pipe. SeqIO then > > successfully guesses the format and does the SeqIO thing, at the expense > of > > having the program pass over the data at least twice. And if the input > file > > is huge, it could potentially consume all the memory. A better way to > > address the problem would be to process the input one line at a time, but > > this seems to require more extensive changes. > > Have you tried tempfiles? Not that this is a great solution, but it's very > commonly used for large sequence data, and it is seekable. This behavior > could also be wrapped in GuessSeqFormat i suppose (but see below) > > > The reason I'm reposting this is because I think that the inability to > guess > > the sequence format from data originating from a pipe is an important > > limitation for a fundamental part of BioPERL. When designing scripts to > be > > used in pipelines, the inability to guess formats for piped data limits > > BioPERL's pipelineability substantially. Even though previous reports of > > this have been made and a bug opened and closed, I was wondering if > anyone > > thought this was worthwhile fixing so as to make SeqIO (and probably > AlignIO > > as well?) more flexible? > > > > Does anyone think this should be refiled as a bug? > > > > Cheers, > > > > J.J. > > The fundamental problem with pipes (as you indicated) is that the data > stream is not seekable. 
We do have a built-in buffer in Bio::Root::IO that > somewhat handles this, but Bio::Tools::GuessSeqFormat is (IIRC) designed to > use the filehandle directly, bypassing the BioPerl IO layer completely. > > One solution is to redesign GuessSeqFormat to use Bio::Root::IO, have > GuessSeqFormat push all data back to the buffer, then let SeqIO parse. That > will require some fundamental changes for both Bio::Root::IO and Bio::SeqIO > (note that one cannot pass a Bio::Root::IO instance to another > Bio::Root::IO-based class for parsing at this time). > > The other option is (as hinted above) having GuessSeqFormat dump the data > to a tempfile, seek back after guessing, and retain the filehandle for > Bio::SeqIO. Not the best solutions, but either should work. > > My question (not a criticism, just trying to understand the problem): why > are you going through all the trouble of using GuessSeqFormat as a permanent > solution anyway? If you have a stream returning a possibly unknown data > type, I would argue that the fundamental bug is not GuessSeqFormat but > something else, more specifically not knowing the behavior of the data > source and the returned format to begin with. Is something preventing that? > > My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not > a permanent solution to your problems (it is guessing, after all). Note the > code has had very little development over the years, and the related SeqIO > code hasn't aged particularly well. > > > PS > > > > Below are snippets of code and/or errors related to reproducing the > failure > > to guess unspecified formats. I'll see how Mailman treats my attachments > and > > post the code as a reply if they don't work. > > > > The bioperl_fhtest.pl attachment is the script that reproduces the > error. > > The w.fa is a fasta file containing some sequence. 
> > > > Here are the command lines to generate the behavior I observe (w.fa is a > > file containing some fasta sequences, in my case it was the w gene from > > different *Drosophila* species): > > > > ./bioperl_fhtest.pl fasta < w.fa # Works (redirection, no guessing) > >> ./bioperl_fhtest.pl < w.fa # Works (redirection, guessing) > >> > >> cat w.fa | ./bioperl_fhtest.pl fasta # Works (pipe, no guessing) > >> cat w.fa | ./bioperl_fhtest.pl # DOESN'T work (pipe, guessing) > >> > > > > > > Here's the error I get in the last case: > > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > >> MSG: Failed resetting the filehandle; IO error occurred > >> STACK: Error::throw > >> STACK: Bio::Root::Root::throw > >> /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472 > >> STACK: Bio::Tools::GuessSeqFormat::guess > >> /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512 > >> STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381 > >> STACK: ./bioperl_fhtest.pl:8 > >> ----------------------------------------------------------- > >> > > > >> From what I gather, the error is triggered by a failure of seek() on a > STDIO > > fh on lines 517-518 (text from the version GuessSeqFormat.pm installed on > my > > server): > > > > 512 if (defined $self->{-file}) { > >> 513 # Close the file we opened. > >> 514 close($fh); > >> 515 } elsif (ref $fh eq 'GLOB') { > >> 516 # Try seeking to the start position. > >> 517 seek($fh, $start_pos, 0) || $self->throw("Failed > resetting > >> the ". > >> 518 "filehandle; IO error > >> occurred");; > >> 519 } elsif (defined $fh && $fh->can('setpos')) { > >> 520 # Seek to the start position. > >> 521 $fh->setpos($start_pos); > >> 522 } > >> > > _______________________________________________ > > You are always welcome to reopen and update the bug, or file a new one. 
> > chris > > From cjfields at illinois.edu Thu Aug 25 17:04:15 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 16:04:15 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: On Aug 25, 2011, at 1:52 PM, J.J. Emerson wrote: > Hi Chris, > > You asked: > > My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway? If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? > > In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. > > My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. > > I see. I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? That's fine. I don't want to dissuade you from taking this on, either. > Cheers, > > J.J. 
> > PS > > * The way I plan on using my script is roughly as follows: > > prog1 [some arguments] \ > | myscript.pl --informat fasta \ > | prog2 \ > | prog3 > pipeline.output > > However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. In principle, being able to do the following would be very useful: > > prog1 [some arguments] \ > | myscript.pl \ > | prog2 > pipeline.output > > The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. Not disagreeing with you at all, flexible code is best. chris From hlapp at drycafe.net Thu Aug 25 22:29:44 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 26 Aug 2011 11:29:44 +0900 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> Could this behavior perhaps be made optional, with the default being off? -hilmar On Aug 25, 2011, at 11:35 PM, Brian Osborne wrote: > bioperl-l, > > I need to run something by you before I commit code and tests. I > have code that takes a Genbank file as input and creates another > Genbank file as output. I noticed that SeqIO - specifically > FTHelper.pm - was taking a tag like this in the input file: > > /score=100.1 > > And adding a "note" tag, so the output file contains this: > > /score=100.1 > /note="score=100.1" > > I'm assuming that the code does this because NCBI will not accept > score tags and values even though Bioperl, generally speaking, does > not say that NCBI defines the fine details of Genbank format. > > On the other hand I don't like the idea that SeqIO is altering the > content. 
It also turns out that if you have code that does multiple > round-trips you end up with text like this: > > /score=100.1 > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > > Should I comment out the code that's doing these edits or not? > > Thanks again, > > Brian O. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From carandraug+dev at gmail.com Fri Aug 26 10:20:39 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Fri, 26 Aug 2011 15:20:39 +0100 Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast In-Reply-To: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> References: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Message-ID: On 22 August 2011 07:01, Lucky Singh wrote: > Now I > wanted to host it from web server, but This program is not working from it > may be it is not able to create or write on file from web server but in > command line it is working fine. I don't know the possible reason, please > help me to figure it out. Have you looked in the apache logs (look in /var/log/apache2/error.log) ? Can you pastebin your whole code and the content of the error log after trying to run the script? From bosborne11 at verizon.net Fri Aug 26 10:39:44 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 26 Aug 2011 10:39:44 -0400 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> References: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> Message-ID: <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> Hilmar, Yes, of course. 
Are you thinking that this code is designed, in part, to help people submit to NCBI? BIO On Aug 25, 2011, at 10:29 PM, Hilmar Lapp wrote: > Could this behavior perhaps be made optional, with the default being off? > > -hilmar > > On Aug 25, 2011, at 11:35 PM, Brian Osborne wrote: > >> bioperl-l, >> >> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >> >> /score=100.1 >> >> And adding a "note" tag, so the output file contains this: >> >> /score=100.1 >> /note="score=100.1" >> >> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >> >> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >> >> /score=100.1 >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> >> Should I comment out the code that's doing these edits or not? >> >> Thanks again, >> >> Brian O. 
>> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > From hlapp at drycafe.net Fri Aug 26 10:50:26 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 26 Aug 2011 23:50:26 +0900 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> References: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> Message-ID: On Aug 26, 2011, at 11:39 PM, Brian Osborne wrote: > Are you thinking that this code is designed, in part, to help people > submit to NCBI? I don't know, but perhaps. My thinking was, if the code is doing something that's useful in some, but bad in many or most other situations, it'd be nice if the useful behavior could be retained as an option for those who expressly want (or need) it. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From florent.angly at gmail.com Sat Aug 27 07:12:05 2011 From: florent.angly at gmail.com (Florent Angly) Date: Sat, 27 Aug 2011 21:12:05 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: <4E58D105.7050805@gmail.com> On the topic of guessing file formats, last I checked, it was difficult to reuse the format guessed by Bio::SeqIO For example, if I want to takes sequences in any format (FASTA, FASTQ, ...) 
and filter some of them out and put them in a new file in the same format, I need to do something along these lines:

    # Open the file and let BioPerl guess its format
    my $in = Bio::SeqIO->new( -file => $input_seqfile );

    # Have Bioperl guess the format (again) so we can use the same format for the output file
    my $format = $in->_guess_format( $input_seqfile );

    # Open the output file (same format as the input file)
    my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , -format => $format );

    # Now do the work...

The limitation of the code above is that it is more complex than it should be and forces Bioperl to check the file format twice. My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store or access its value. The idea would be that the example code above could be rewritten as:

    # Open the file and let BioPerl guess its format
    my $in = Bio::SeqIO->new( -file => $input_seqfile );

    # Retrieve the format guessed by BioPerl
    my $format = $in->format( );

    # Open the output file using the same format as the input file
    my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , -format => $format );

    # Now do the work...

I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, which guesses the alphabet but has a get/set method to access it.

Florent

On 26/08/11 07:04, Chris Fields wrote:
> On Aug 25, 2011, at 1:52 PM, J.J. Emerson wrote:
>
>> Hi Chris,
>>
>> You asked:
>>
>> My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway?
If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? >> >> In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. >> >> My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. >> >> I see. I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? > That's fine. I don't want to dissuade you from taking this on, either. > >> Cheers, >> >> J.J. >> >> PS >> >> * The way I plan on using my script is roughly as follows: >> >> prog1 [some arguments] \ >> | myscript.pl --informat fasta \ >> | prog2 \ >> | prog3> pipeline.output >> >> However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. 
In principle, being able to do the following would be very useful: >> >> prog1 [some arguments] \ >> | myscript.pl \ >> | prog2> pipeline.output >> >> The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. > Not disagreeing with you at all, flexible code is best. > > chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 26 23:54:05 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 26 Aug 2011 22:54:05 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E58D105.7050805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: On Aug 27, 2011, at 6:12 AM, Florent Angly wrote: > On the topic of guessing file formats, last I checked, it was difficult to reuse the format guessed by Bio::SeqIO > > For example, if I want to takes sequences in any format (FASTA, FASTQ, ...) and filter some of them out and put them in a new file in the same format, I need to do something along these lines: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Have Bioperl guess the format (again) so we can use the same format for the output file > my $format = $in->_guess_format( $input_seqfile ); > > # Open the output file (same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > The limitations of the code above is that in is more complex than it should be and forces Bioperl do check the file format twice. 
My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value.

The name of the class is the format (that's how they are loaded). We could add this as a convenience level for Bio::SeqIO (fairly easy to do, actually), but it would only make sense as a getter. Bio::SeqIO dynamically loads the proper Bio::SeqIO:: module in the constructor (Bio::SeqIO::genbank, for example). Setting the format to 'fasta' on an already loaded Bio::SeqIO::genbank would still give you GenBank format.

> The idea would be that the example code above could be rewritten as:
>
> # Open the file and let BioPerl guess its format
> my $in = Bio::SeqIO->new( -file => $input_seqfile );
>
> # Retrieve the format guessed by BioPerl
> my $format = $in->format( );
>
> # Open the output file using the same format as the input file
> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format );
>
> # Now do the work...
>
> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it.
>
> Florent

Guessing the alphabet for the vast majority of sequence data isn't quite as complex and quixotic as guessing a sequence format. The latter is far more variable and infinitely increases, much like standards (ex: http://xkcd.com/927/). Not that sequences aren't capable of change...
chris From hlapp at drycafe.net Fri Aug 26 23:43:57 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Sat, 27 Aug 2011 12:43:57 +0900 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E58D105.7050805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: The format is already available - it is in essence the class of the SeqIO instance: my $format = ref($in); Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: my $out = ref($in)->new(-file => ...); Would that address what you are trying to accomplish? -hilmar Sent with a tap. On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: > My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The idea would be that the example code above could be rewritten as: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Retrieve the format guessed by BioPerl > my $format = $in->format( ); > > # Open the output file using the same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. 
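Hilmar's point above, that the format is in essence the class of the SeqIO instance, can be illustrated in plain Perl. This is only a minimal sketch and not the method that was later committed: format_from_class() is a hypothetical helper name, and the blessed hash merely stands in for what Bio::SeqIO->new() would return for a FASTA file, so the snippet runs without BioPerl installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): derive the format name from
# the class a SeqIO-style object is blessed into, e.g.
# Bio::SeqIO::fasta -> "fasta".
sub format_from_class {
    my ($io) = @_;
    my $class = ref($io) || $io;             # accept an object or a class name
    my ($format) = $class =~ /::([^:]+)\z/;  # last component of the package name
    return $format;                          # undef if there is no "::" in the name
}

# Stand-in for an object returned by Bio::SeqIO->new(); assumption for the demo.
my $in = bless {}, 'Bio::SeqIO::fasta';

print format_from_class($in), "\n";
```

With a real stream, ref($in)->new(-file => ...) as Hilmar suggests avoids the name lookup altogether; the helper is only useful when the format name itself is needed, for example in a log message.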
From florent.angly at gmail.com Sun Aug 28 05:08:32 2011 From: florent.angly at gmail.com (Florent Angly) Date: Sun, 28 Aug 2011 19:08:32 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: <4E5A0590.2010805@gmail.com> Yes indeed, that's a very convenient way to implement a format() method that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has 'sanger', 'illumina' and 'solexa' variants. Florent On 27/08/11 13:43, Hilmar Lapp wrote: > The format is already available - it is in essence the class of the SeqIO instance: > > my $format = ref($in); > > Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: > > my $out = ref($in)->new(-file => ...); > > Would that address what you are trying to accomplish? > > -hilmar > > Sent with a tap. > > On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: > >> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The idea would be that the example code above could be rewritten as: >> >> # Open the file and let BioPerl guess its format >> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >> >> # Retrieve the format guessed by BioPerl >> my $format = $in->format( ); >> >> # Open the output file using the same format as the input file >> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >> >> # Now do the work... >> >> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it.
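For the problem that started this thread (seek() failing when the input is a pipe), the tempfile option Chris described earlier can be sketched as follows. This is a sketch under stated assumptions, not what GuessSeqFormat does: spool_to_tempfile() is a hypothetical name, the input is assumed to be small enough to copy to disk, and the pipe created here only simulates `cat w.fa | script.pl`.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: copy a non-seekable handle (pipe, STDIN) into an
# anonymous temporary file and return a rewound, seekable handle.
sub spool_to_tempfile {
    my ($in_fh) = @_;
    my $tmp_fh = tempfile();    # scalar context: only a handle is returned
    while ( my $line = <$in_fh> ) {
        print {$tmp_fh} $line;
    }
    seek( $tmp_fh, 0, 0 ) or die "Cannot rewind temp file: $!";
    return $tmp_fh;
}

# Simulate piped input with a real pipe, which cannot be seek()'d.
pipe( my $reader, my $writer ) or die "pipe failed: $!";
print {$writer} ">seq1\nACGT\n";
close $writer;

my $fh = spool_to_tempfile($reader);

my $first = <$fh>;                            # peek, as a format guesser would
seek( $fh, 0, 0 ) or die "seek failed: $!";   # rewinding now works

print while <$fh>;                            # the full input is still available
```

The returned handle could then be handed to Bio::SeqIO->new( -fh => $fh ), so that guessing and parsing both see the data from the start. The cost is one extra copy of the input, which is why Chris called it "not the best" solution.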
From cjfields at illinois.edu Sat Aug 27 23:27:34 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sat, 27 Aug 2011 22:27:34 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E5A0590.2010805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> Message-ID: <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> There is no reason the variant couldn't also be a method; it's fairly generic to Bio::SeqIO. FASTQ just happens to be the only parser that takes advantage of it (probably b/c I added it when I refactored FASTQ :) See the code for Bio::SeqIO::new to see what is done. Again, like the format it only makes sense as a getter method. chris On Aug 28, 2011, at 4:08 AM, Florent Angly wrote: > > Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. > Florent > > > On 27/08/11 13:43, Hilmar Lapp wrote: >> The format is already available - it is in essence the class of the SeqIO instance: >> >> my $format = ref($in); >> >> Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: >> >> my $out = ref($in)->new(-file => ...); >> >> Would that address what you are trying to accomplish? >> >> -hilmar >> >> Sent with a tap. >> >> On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: >> >>> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. 
The idea would be that the example code above could be rewritten as: >>> >>> # Open the file and let BioPerl guess its format >>> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >>> >>> # Retrieve the format guessed by BioPerl >>> my $format = $in->format( ); >>> >>> # Open the output file using the same format as the input file >>> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >>> >>> # Now do the work... >>> >>> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florent.angly at gmail.com Sun Aug 28 18:35:36 2011 From: florent.angly at gmail.com (Florent Angly) Date: Mon, 29 Aug 2011 08:35:36 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> Message-ID: <4E5AC2B8.9060808@gmail.com> Hi, I implemented the format() getter method in Bio::SeqIO as discussed, essentially following the way proposed by Hilmar. The variant() method is not needed since Bio::SeqIO::fastq already has a get/set method for that. I noticed that there are plenty more Bio*IO modules that could benefit from having a format() method, e.g.: Bio::AlignIO Bio::ClusterIO Bio::FeatureIO Bio::MapIO Bio::OntologyIO Bio::SearchIO Bio::TreeIO Bio::Assembly::IO * The code could be copy-pasted for each of them but it is not very graceful. Is there a way we could have all these IO modules share the same format() method? 
* Note how the IO class for Bio::Assembly is called Bio::Assembly::IO, and not Bio::AssemblyIO like for other classes. This may be something to change in the future for consistency. Florent On 28/08/11 13:27, Chris Fields wrote: > There is no reason the variant couldn't also be a method; it's fairly generic to Bio::SeqIO. FASTQ just happens to be the only parser that takes advantage of it (probably b/c I added it when I refactored FASTQ :) > > See the code for Bio::SeqIO::new to see what is done. Again, like the format it only makes sense as a getter method. > > chris > > On Aug 28, 2011, at 4:08 AM, Florent Angly wrote: > >> Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. >> Florent >> >> >> On 27/08/11 13:43, Hilmar Lapp wrote: >>> The format is already available - it is in essence the class of the SeqIO instance: >>> >>> my $format = ref($in); >>> >>> Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: >>> >>> my $out = ref($in)->new(-file => ...); >>> >>> Would that address what you are trying to accomplish? >>> >>> -hilmar >>> >>> Sent with a tap. >>> >>> On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: >>> >>>> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. 
The idea would be that the example code above could be rewritten as: >>>> >>>> # Open the file and let BioPerl guess its format >>>> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >>>> >>>> # Retrieve the format guessed by BioPerl >>>> my $format = $in->format( ); >>>> >>>> # Open the output file using the same format as the input file >>>> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >>>> >>>> # Now do the work... >>>> >>>> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Sun Aug 28 21:10:27 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 28 Aug 2011 20:10:27 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E5AC2B8.9060808@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> <4E5AC2B8.9060808@gmail.com> Message-ID: On Aug 28, 2011, at 5:35 PM, Florent Angly wrote: > Hi, > > I implemented the format() getter method in Bio::SeqIO as discussed, essentially following the way proposed by Hilmar. The variant() method is not needed since Bio::SeqIO::fastq already has a get/set method for that. Right, but the method could be used by other modules if it were moved to Bio::SeqIO. for instance. 
> I noticed that there are plenty more Bio*IO modules that could benefit from having a format() method, e.g.: > Bio::AlignIO > Bio::ClusterIO > Bio::FeatureIO > Bio::MapIO > Bio::OntologyIO > Bio::SearchIO > Bio::TreeIO > Bio::Assembly::IO * > The code could be copy-pasted for each of them but it is not very graceful. Is there a way we could have all these IO modules share the same format() method? Move the method to Bio::Root::IO, the common base class for all of the above. > * Note how the IO class for Bio::Assembly is called Bio::Assembly::IO, and not Bio::AssemblyIO like for other classes. This may be something to change in the future for consistency. > > Florent That's possible; one could take advantage of that for redesign/API issues if it were needed. chris From noncoding at gmail.com Mon Aug 29 06:31:10 2011 From: noncoding at gmail.com (Remo Sanges) Date: Mon, 29 Aug 2011 12:31:10 +0200 Subject: [Bioperl-l] Opportunity: PhD in BIOINFORMATICS at SZN, Naples, Italy In-Reply-To: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net> References: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net> Message-ID: <4E5B6A6E.2020508@gmail.com> (Apologies if you have received this already or if this is considered spam. Please feel free to pass on to anyone who might be interested.) The Stazione Zoologica Anton Dohrn in Naples is among the top research institutions in the world in the fields of marine biology and ecology. The new established bioinformatics laboratory is seeking for a candidate interested in the evolution of genome architecture http://bit.ly/okEGvL We are looking for someone who understands basic biological and evolutionary problems and is able to independently accomplish bioinformatics tasks. Candidates will be expected to have knowledge of biology, genetics and functional genomics, to demonstrate the ability to work in a UNIX/Linux environment and to be familiar with a scripting language (e.g. Perl), a database system (e.g. 
MySQL) and a statistical programming environment (e.g R). Previous experience with comparative genomics and genomics databases as well as an understanding of statistical methods used in the interpretation of biological data is a desirable asset. Wet lab work might be required during the PhD. All the information about the PhD and the guidelines on how to apply are listed on the webpage http://bit.ly/d2WuXk The closing date for applications is 20 September 2011. Kind Regards Remo -- Remo Sanges Bioinformatics - Animal Physiology and Evolution Stazione Zoologica Anton Dohrn Villa Comunale, 80121 Napoli - Italy +39 081 5833428 From locarpau at upvnet.upv.es Mon Aug 29 12:47:13 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Mon, 29 Aug 2011 18:47:13 +0200 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> Message-ID: <1314636433.4e5bc291a40c6@webmail.upv.es> Hi all, I'm running codeml from the PAML package using the corresponding Bioperl wrapper. 
I'd like to save the output file as -outfile => 'mlc', as in:

    my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml (
        -outfile        => 'mlc',
        -save_tempfiles => 1,
        -alignment      => $codon_MSA,
        -tree           => $biotree,
        -params         => {
            #'outfile' =>'mlc',
            'verbose'   => 1,
            'noisy'     => 9,
            'runmode'   => 0, #user tree
            'seqtype'   => 1,
            'model'     => $model,
            'NSsites'   => $NSsites,
            'fix_omega' => $fix_omega,
            'omega'     => $omega,
            'ncatG'     => $ncatG,
            'icode'     => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below (5:ciliate nuclear)
            #'fix_alpha' => 0,
            #'fix_kappa' => 0,
            #'RateAncestor' => 0,
            'CodonFreq' => 2,
            'cleandata' => 1, # remove sites with ambiguity data (1 yes, 0 no)
            'ndata'     => 1
        },
    );

and subsequently parsing it using

    my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./");

However, I get the following message.

    ------------- EXCEPTION -------------
    MSG: Could not open mlc: No such file or directory
    STACK Bio::Root::IO::_initialize_io /Library/Perl//5.10.0/Bio/Root/IO.pm:351
    STACK Bio::Tools::Phylo::PAML::new /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239
    STACK main::BranchSiteEvolAnalysis /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421
    STACK toplevel /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939
    -------------------------------------

which I guess means the output file is not being saved in the previous step. Does anyone know what's wrong? Thank you very much in advance for your help.
Cheers, Lorenzo From David.Messina at sbc.su.se Mon Aug 29 13:43:33 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 29 Aug 2011 19:43:33 +0200 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: <1314636433.4e5bc291a40c6@webmail.upv.es> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> Message-ID: Hi Lorenzo, and subsequently parsing it using > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./"); > > However, I get the following message. > > ------------- EXCEPTION ------------- > MSG: Could not open mlc: No such file or directory > > what I guess means the output file is not being saved in the previous step. > Your interpretation could be correct. I think though that it might be that the -dir parameter you specify, "./", is not correct. Are you seeing the mlc file in the '.' (current working) dir? If I remember correctly, by default the mlc file is created in a temporary directory in /scratch or /tmp, and the save_tempfiles flag simply keeps that temporary directory from being deleted. I don't have the docs in front of me, but I believe there's a way to get the path of the temp directory that B::T::P::PAML is using. If so, you can use that path as the value for the -dir parameter. Let me know if not, though, and we can follow up on this. Dave PS - also, could you verify that you're using the latest versions of bioperl-live and bioperl-run from Github? 
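Dave's suggestion (look for the mlc file in the wrapper's temporary directory rather than in "./") can be sketched like this. It is a hedged sketch only: it assumes the run factory offers tempdir() and outfile_name() accessors, as classes built on Bio::Tools::Run::WrapperBase are expected to, so please check the installed documentation. mlc_path() and MockFactory are hypothetical names used so that the snippet runs without BioPerl or PAML.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: build the path of the codeml output file from the
# run factory. Assumes the factory provides tempdir() and outfile_name().
sub mlc_path {
    my ($factory) = @_;
    return File::Spec->catfile( $factory->tempdir, $factory->outfile_name );
}

# Mock standing in for the Codeml factory; the values are made up for the demo.
{
    package MockFactory;
    sub new          { return bless {}, shift }
    sub tempdir      { return '/tmp/codeml_run' }
    sub outfile_name { return 'mlc' }
}

my $path = mlc_path( MockFactory->new );
print "$path\n";
```

With a real factory, the resulting path would be given as -file to Bio::Tools::Phylo::PAML->new() after run() has finished, with -save_tempfiles => 1 set so the directory is not cleaned up. If memory of the Codeml synopsis serves, run() also returns a ready-made parser alongside the return code, which would make the second step unnecessary.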
From Kevin.M.Brown at asu.edu Mon Aug 29 14:09:29 2011 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 29 Aug 2011 11:09:29 -0700 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu><1314636433.4e5bc291a40c6@webmail.upv.es> Message-ID: <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu> Opening a file for output that does not exist requires the > or >> redirector (depending on if you want to overwrite or append output). my $parserF= Bio::Tools::Phylo::PAML->new (-file => ">mlc", -dir => "./"); Kevin Brown Center for Innovations in Medicine Biodesign Institute Arizona State University > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Dave Messina > Sent: Monday, August 29, 2011 10:44 AM > To: Lorenzo Carretero Paulet > Cc: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Saving Codeml Output file > > Hi Lorenzo, > > > and subsequently parsing it using > > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > "./"); > > > > However, I get the following message. > > > > ------------- EXCEPTION ------------- > > MSG: Could not open mlc: No such file or directory > > > > > > what I guess means the output file is not being saved in the previous > step. > > > > > Your interpretation could be correct. I think though that it might be > that > the -dir parameter you specify, "./", is not correct. Are you seeing > the mlc > file in the '.' (current working) dir? > > If I remember correctly, by default the mlc file is created in a > temporary > directory in /scratch or /tmp, and the save_tempfiles flag simply keeps > that > temporary directory from being deleted. 
> > I don't have the docs in front of me, but I believe there's a way to > get the > path of the temp directory that B::T::P::PAML is using. If so, you can > use > that path as the value for the -dir parameter. > > Let me know if not, though, and we can follow up on this. > > Dave > > PS - also, could you verify that you're using the latest versions of > bioperl-live and bioperl-run from Github? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Mon Aug 29 14:34:41 2011 From: scott at scottcain.net (Scott Cain) Date: Mon, 29 Aug 2011 14:34:41 -0400 Subject: [Bioperl-l] pls help.. In-Reply-To: References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu> Message-ID: Hi Ravi, Sorry I took a while to get back to you; I was on vacation last week. Also, please keep correspondence on the bioperl mailing list. If you had, perhaps somebody else would have provided another answer by now. I found the bug in the genbank2gff3 script that causes this problem. You have a few options for how to proceed: 1. Split the multi-genbank file into individual files, put them in a directory, and point the script at that directory (with the --dir flag). If you do this, you won't have to do anything with your BioPerl installation. 2. Get a fresh checkout of bioperl-live from git and install BioPerl from it, as I just committed the fix to the master branch. 3. Manually apply the fix that I just put into master. The diff is here: https://github.com/bioperl/bioperl-live/commit/1cff7d541e704a1f35d85bb27a0ab5911d89f8df Scott On Tue, Aug 23, 2011 at 12:55 AM, Ravi Devani wrote: > Yes the script works but have you seen the gff file generated by it. It has > multiple entries for the same features. And the file keeps on growing in > size with thw same features repeated many times. Thats the problem.. 
> > Thanking you, > Ravi > > > >

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)         216-392-3087
Ontario Institute for Cancer Research

From locarpau at upvnet.upv.es Mon Aug 29 14:56:50 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 20:56:50 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To:
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es>
Message-ID: <1314644210.4e5be0f277c05@webmail.upv.es>

Thanks Dave,

Yes. I do not find the output file in the current directory or in the temp directory. Using

my $tmpdir = $codeml_factory->tempdir();
my $parserF = Bio::Tools::Phylo::PAML->new ( -file => "mlc", -dir => "$tmpdir" );

I still get the same error message. I'm using BioPerl version 1.006901.

Cheers,
Lorenzo

Quoting Dave Messina:

> Hi Lorenzo, > > > and subsequently parsing it using > > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./"); > > > > However, I get the following message. > > > > ------------- EXCEPTION ------------- > > MSG: Could not open mlc: No such file or directory > > > > > > what I guess means the output file is not being saved in the previous step. > > > > > Your interpretation could be correct. I think though that it might be that > the -dir parameter you specify, "./", is not correct. Are you seeing the mlc > file in the '.' (current working) dir? > > If I remember correctly, by default the mlc file is created in a temporary > directory in /scratch or /tmp, and the save_tempfiles flag simply keeps that > temporary directory from being deleted. 
> > I don't have the docs in front of me, but I believe there's a way to get the > path of the temp directory that B::T::P::PAML is using. If so, you can use > that path as the value for the -dir parameter. > > Let me know if not, though, and we can follow up on this. > > Dave > > PS - also, could you verify that you're using the latest versions of > bioperl-live and bioperl-run from Github? >

From locarpau at upvnet.upv.es Mon Aug 29 15:05:49 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 21:05:49 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu><1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu>
Message-ID: <1314644749.4e5be30d78cb7@webmail.upv.es>

Kevin,

Still the same. The previous message is now preceded by:

Filehandle GEN11 opened only for output at /Library/Perl//5.10.0/Bio/Root/IO.pm line 571

which points to

# if the buffer been filled by _pushback then return the buffer
# contents, rather than read from the filehandle
if( @{$self->{'_readbuffer'} || [] } ) {
    $line = shift @{$self->{'_readbuffer'}};
} else {
    $line = <$fh>;
}

in the inner subroutine _readline of /Bio/Root/IO.pm.

Best,
L

Quoting Kevin Brown:

> Opening a file for output that does not exist requires the > or >> > redirector (depending on if you want to overwrite or append output). 
> > my $parserF= Bio::Tools::Phylo::PAML->new (-file => ">mlc", -dir => > "./"); > > > > Kevin Brown > Center for Innovations in Medicine > Biodesign Institute > Arizona State University > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Dave Messina > > Sent: Monday, August 29, 2011 10:44 AM > > To: Lorenzo Carretero Paulet > > Cc: bioperl-l at lists.open-bio.org > > Subject: Re: [Bioperl-l] Saving Codeml Output file > > > > Hi Lorenzo, > > > > > > and subsequently parsing it using > > > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > > "./"); > > > > > > However, I get the following message. > > > > > > ------------- EXCEPTION ------------- > > > MSG: Could not open mlc: No such file or directory > > > > > > > > > > what I guess means the output file is not being saved in the > previous > > step. > > > > > > > > > Your interpretation could be correct. I think though that it might be > > that > > the -dir parameter you specify, "./", is not correct. Are you seeing > > the mlc > > file in the '.' (current working) dir? > > > > If I remember correctly, by default the mlc file is created in a > > temporary > > directory in /scratch or /tmp, and the save_tempfiles flag simply > keeps > > that > > temporary directory from being deleted. > > > > I don't have the docs in front of me, but I believe there's a way to > > get the > > path of the temp directory that B::T::P::PAML is using. If so, you can > > use > > that path as the value for the -dir parameter. > > > > Let me know if not, though, and we can follow up on this. > > > > Dave > > > > PS - also, could you verify that you're using the latest versions of > > bioperl-live and bioperl-run from Github? 
> > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l >

From Kevin.M.Brown at asu.edu Mon Aug 29 15:19:53 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 29 Aug 2011 12:19:53 -0700
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1314636433.4e5bc291a40c6@webmail.upv.es>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es>
Message-ID: <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>

OK, went back to the original message.

And here's where the problem actually originates...

my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml (
    # this should cause it to create a file called mlc
    -outfile        => '>mlc',
    -save_tempfiles => 1,
    -alignment      => $codon_MSA,
    -tree           => $biotree,
    -params         => {
        'verbose'   => 1,
        'noisy'     => 9,
        'runmode'   => 0,   # user tree
        'seqtype'   => 1,
        'model'     => $model,
        'NSsites'   => $NSsites,
        'fix_omega' => $fix_omega,
        'omega'     => $omega,
        'ncatG'     => $ncatG,
        'icode'     => 0,   # 0:universal code; 1:mammalian mt; 2-10:see below (5:ciliate nuclear)
        #'fix_alpha' => 0,
        #'fix_kappa' => 0,
        #'RateAncestor' => 0,
        'CodonFreq' => 2,
        'cleandata' => 1,   # remove sites with ambiguity data (1 yes, 0 no)
        'ndata'     => 1
    },
);

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University

> -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet > Sent: Monday, August 29, 2011 9:47 AM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Saving Codeml Output file > > Hi all, > I'm running 
codeml from the PAML package using the corresponding > Bioperl > wrapper. I'd like to save the output file as -outfile => 'mlc', as in: > my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > ( -outfile => 'mlc', > -save_tempfiles => 1, > -alignment => $codon_MSA, > -tree => $biotree, > -params => > { > #'outfile' =>'mlc', > 'verbose' => 1, > 'noisy' => 9, > 'runmode' => 0, #user tree > 'seqtype' => 1, > 'model' => $model, > 'NSsites' => $NSsites, > 'fix_omega' => $fix_omega, > 'omega' => $omega, > 'ncatG' => $ncatG, > 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below > (5:ciliate > nuclear) > #'fix_alpha' => 0, > #'fix_kappa' => > 0, #'RateAncestor' => 0, > 'CodonFreq' => 2, > 'cleandata' => > 1, # remove sites with amibguity data (1 yes, 0 no), > 'ndata' => 1 > }, > ); > > and subsequently parsing it using > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > "./"); > > However, I get the following message. > > ------------- EXCEPTION ------------- > MSG: Could not open mlc: No such file or directory > STACK Bio::Root::IO::_initialize_io > /Library/Perl//5.10.0/Bio/Root/IO.pm:351 > STACK Bio::Tools::Phylo::PAML::new > /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239 > STACK main::BranchSiteEvolAnalysis > /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421 > STACK toplevel > /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939 > ------------------------------------- > > what I guess means the output file is not being saved in the previous > step. > Anyone knows what's wrong. > Tnak you very much in advance for your help. 
> Cheers, > Lorenzo > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l

From locarpau at upvnet.upv.es Mon Aug 29 19:19:46 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Tue, 30 Aug 2011 01:19:46 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>
Message-ID: <1314659986.4e5c1e9268078@webmail.upv.es>

Kevin,

That's pretty reasonable, but unfortunately it still doesn't run, even if I create the file as $outfile and pass it to the wrapper as -outfile => $outfile. It seems as if Bio::Tools::Run::Phylo::PAML::Codeml fails to create the outfile. Has anyone managed to generate the outfile from Bio::Tools::Run::Phylo::PAML::Codeml?

Cheers,
Lorenzo

Quoting Kevin Brown:

> OK, went back to the original message. > > And here's where the problem actually originates... 
> > my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > ( > # this should cause it to create a file > called mlc > -outfile => '>mlc', > -save_tempfiles => 1, > -alignment => > $codon_MSA, > -tree => > $biotree, > -params => > { > 'verbose' => 1, > 'noisy' => 9, > 'runmode' => 0, #user tree > 'seqtype' => 1, > 'model' => $model, > 'NSsites' => $NSsites, > 'fix_omega' => $fix_omega, > 'omega' => $omega, > 'ncatG' => $ncatG, > 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see > below (5:ciliate nuclear) > #'fix_alpha' => 0, > #'fix_kappa' => 0, > #'RateAncestor' => 0, > 'CodonFreq' => 2, > 'cleandata' => 1, # remove sites with amibguity data (1 yes, 0 > no), > 'ndata' => 1 > }, > ); > > > Kevin Brown > Center for Innovations in Medicine > Biodesign Institute > Arizona State University > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet > > Sent: Monday, August 29, 2011 9:47 AM > > To: bioperl-l at lists.open-bio.org > > Subject: [Bioperl-l] Saving Codeml Output file > > > > Hi all, > > I'm running codeml from the PAML package using the corresponding > > Bioperl > > wrapper. 
I'd like to save the output file as -outfile => 'mlc', as in: > > my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > > ( -outfile => 'mlc', > > -save_tempfiles => 1, > > -alignment => > $codon_MSA, > > -tree => > $biotree, > > -params => > > { > > #'outfile' =>'mlc', > > 'verbose' => 1, > > 'noisy' => 9, > > 'runmode' => 0, #user tree > > 'seqtype' => 1, > > 'model' => $model, > > 'NSsites' => $NSsites, > > 'fix_omega' => $fix_omega, > > 'omega' => $omega, > > 'ncatG' => $ncatG, > > 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below > > (5:ciliate > > nuclear) > > #'fix_alpha' => 0, > > #'fix_kappa' => > > 0, #'RateAncestor' > => 0, > > 'CodonFreq' => > 2, > > 'cleandata' => > > 1, # remove sites with amibguity data (1 yes, 0 no), > > 'ndata' => 1 > > > }, > > ); > > > > and subsequently parsing it using > > my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > > "./"); > > > > However, I get the following message. > > > > ------------- EXCEPTION ------------- > > MSG: Could not open mlc: No such file or directory > > STACK Bio::Root::IO::_initialize_io > > /Library/Perl//5.10.0/Bio/Root/IO.pm:351 > > STACK Bio::Tools::Phylo::PAML::new > > /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239 > > STACK main::BranchSiteEvolAnalysis > > /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421 > > STACK toplevel > > /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939 > > ------------------------------------- > > > > what I guess means the output file is not being saved in the previous > > step. > > Anyone knows what's wrong. > > Tnak you very much in advance for your help. 
> > Cheers, > > Lorenzo > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l >

From jason.stajich at gmail.com Mon Aug 29 20:05:57 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Mon, 29 Aug 2011 17:05:57 -0700
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1314659986.4e5c1e9268078@webmail.upv.es>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> <1314659986.4e5c1e9268078@webmail.upv.es>
Message-ID:

I think you are mistaken about how to use the run factory objects and their associated parser.

You don't have to instantiate a parser, as one is returned by the run command. The whole point is that you don't need to get to the tempdir or open the mlc file or any of the other output files from the program yourself. You use the parser to get the data out, and it then cleans up afterwards, so you can run many iterations in separate folders without having to clean up by hand.

http://www.bioperl.org/wiki/HOWTO:PAML

my $factory = Bio::Tools::Run::Phylo::PAML::Codeml->new( ... );
my ($rc,$parser) = $factory->run( );

if( my $result = $parser->next_result ) {
    # $result is a Bio::Tools::Phylo::PAML::Result object
}

On Aug 29, 2011, at 4:19 PM, Lorenzo Carretero Paulet wrote:

> Kevin, > That's pretty reasonable, but unfortunately still doesn't run. Even if I create > the file as $outfile and give it as value to the wrapper as -outfile > =>$outfile. It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at > creating the outfile. 
Did anyone manage to generate the outfile from > Bio::Tools::Run::Phylo::PAML::Codeml. > Cheers, > Lorenzo > > Mensaje citado por Kevin Brown : > >> OK, went back to the original message. >> >> And here's where the problem actually originates... >> >> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml >> ( >> # this should cause it to create a file >> called mlc >> -outfile => '>mlc', >> -save_tempfiles => 1, >> -alignment => >> $codon_MSA, >> -tree => >> $biotree, >> -params => >> { >> 'verbose' => 1, >> 'noisy' => 9, >> 'runmode' => 0, #user tree >> 'seqtype' => 1, >> 'model' => $model, >> 'NSsites' => $NSsites, >> 'fix_omega' => $fix_omega, >> 'omega' => $omega, >> 'ncatG' => $ncatG, >> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see >> below (5:ciliate nuclear) >> #'fix_alpha' => 0, >> #'fix_kappa' => 0, >> #'RateAncestor' => 0, >> 'CodonFreq' => 2, >> 'cleandata' => 1, # remove sites with amibguity data (1 yes, 0 >> no), >> 'ndata' => 1 >> }, >> ); >> >> >> Kevin Brown >> Center for Innovations in Medicine >> Biodesign Institute >> Arizona State University >> >>> -----Original Message----- >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >>> bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet >>> Sent: Monday, August 29, 2011 9:47 AM >>> To: bioperl-l at lists.open-bio.org >>> Subject: [Bioperl-l] Saving Codeml Output file >>> >>> Hi all, >>> I'm running codeml from the PAML package using the corresponding >>> Bioperl >>> wrapper. 
I'd like to save the output file as -outfile => 'mlc', as in: >>> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml >>> ( -outfile => 'mlc', >>> -save_tempfiles => 1, >>> -alignment => >> $codon_MSA, >>> -tree => >> $biotree, >>> -params => >>> { >>> #'outfile' =>'mlc', >>> 'verbose' => 1, >>> 'noisy' => 9, >>> 'runmode' => 0, #user tree >>> 'seqtype' => 1, >>> 'model' => $model, >>> 'NSsites' => $NSsites, >>> 'fix_omega' => $fix_omega, >>> 'omega' => $omega, >>> 'ncatG' => $ncatG, >>> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below >>> (5:ciliate >>> nuclear) >>> #'fix_alpha' => 0, >>> #'fix_kappa' => >>> 0, #'RateAncestor' >> => 0, >>> 'CodonFreq' => >> 2, >>> 'cleandata' => >>> 1, # remove sites with amibguity data (1 yes, 0 no), >>> 'ndata' => 1 >>> >> }, >>> ); >>> >>> and subsequently parsing it using >>> my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => >>> "./"); >>> >>> However, I get the following message. >>> >>> ------------- EXCEPTION ------------- >>> MSG: Could not open mlc: No such file or directory >>> STACK Bio::Root::IO::_initialize_io >>> /Library/Perl//5.10.0/Bio/Root/IO.pm:351 >>> STACK Bio::Tools::Phylo::PAML::new >>> /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239 >>> STACK main::BranchSiteEvolAnalysis >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421 >>> STACK toplevel >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939 >>> ------------------------------------- >>> >>> what I guess means the output file is not being saved in the previous >>> step. >>> Anyone knows what's wrong. >>> Tnak you very much in advance for your help. 
>>> Cheers, >>> Lorenzo >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l

From fs5 at sanger.ac.uk Tue Aug 30 05:45:46 2011
From: fs5 at sanger.ac.uk (Frank Schwach)
Date: Tue, 30 Aug 2011 10:45:46 +0100
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu>
References: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu>
Message-ID: <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk>

Yes, I still have the primer3redux doc on my TODO list. Sorry, I haven't had the time to do this lately, but I will look into it as soon as I can.

Frank

On Mon, 2011-08-22 at 15:10 -0500, Chris Fields wrote:

> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote: > > > my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => > > "temp.out", -path => "/usr/bin/primer3_core"); > > > > If I use this: > > $primer3->add_targets( > > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > > =>$PRIMER_PRODUCT_SIZE_RANGE); > > > > I get: > > Can't locate object method "add_targets" via package > > "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. 
> > > > On the other hand, if I change that line to: > > $primer3->set_parameters( > > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > > =>$PRIMER_PRODUCT_SIZE_RANGE); > > > > It works. When I looked at the source code for Primer3Redux, I > > couldn't find add_targets, but set_parameters looked like it might > > work, so I used that instead, and it worked. > > > > But I see over in the github that there are other issues with the > > documentation (how primer3redux's result object is now 3 deep rather > > than 2 deep). Not sure if this is in that category or not. > > That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github, if you can file this as well it would help. > > I have to go back and look at the history of add_targets() reative to primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience... > > chris > > > Thanks, > > Anand > ... 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

From manju.rawat2 at gmail.com Tue Aug 30 07:22:33 2011
From: manju.rawat2 at gmail.com (Manju Rawat)
Date: Tue, 30 Aug 2011 07:22:33 -0400
Subject: [Bioperl-l] Bioperl query....
Message-ID:

Hey, please help me.

I am very new to BioPerl, and I want to use a BLAST report in my program, but I don't know how. Please tell me how to use the HSP, gaps, etc. methods to extract values from a BLAST file.

Thanks
Manju Rawat

From roy.chaudhuri at gmail.com Tue Aug 30 07:25:32 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 30 Aug 2011 12:25:32 +0100
Subject: [Bioperl-l] Bioperl query....
In-Reply-To:
References:
Message-ID: <4E5CC8AC.8050800@gmail.com>

Hi Manju,

See: http://www.bioperl.org/wiki/HOWTO:SearchIO

Cheers,
Roy.

On 30/08/2011 12:22, Manju Rawat wrote:

> Hey Pls help me.. > I am very new in Bioperl.. > And i want to use blast report in my programming.. > But i dnt know how to use it...pls tell me how to use HSP,gaps.etc > methods??/ > how to use them to extract valus from blast file.. 
> > Thanks > Manju Rawat > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Aug 30 09:54:19 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 30 Aug 2011 08:54:19 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk> References: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk> Message-ID: <8063FB1D-4557-4D1B-B9EF-9833ECD440E9@illinois.edu> S'okay, we're all a bit busy :P chris On Aug 30, 2011, at 4:45 AM, Frank Schwach wrote: > Yes, I still have the primer3redux doc on my TODO list. Sorry, haven't > had the time to do this lately but will loook into this as soon as I > can. > Frank > > > On Mon, 2011-08-22 at 15:10 -0500, Chris Fields wrote: >> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote: >> >>> my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => >>> "temp.out", -path => "/usr/bin/primer3_core"); >>> >>> If I use this: >>> $primer3->add_targets( >>> 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, >>> 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, >>> 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, >>> 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, >>> 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, >>> 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, >>> 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, >>> 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, >>> 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' >>> =>$PRIMER_PRODUCT_SIZE_RANGE); >>> >>> I get: >>> Can't locate object method "add_targets" via package >>> "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. 
>>> >>> On the other hand, if I change that line to: >>> $primer3->set_parameters( >>> 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, >>> 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, >>> 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, >>> 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, >>> 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, >>> 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, >>> 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, >>> 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, >>> 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' >>> =>$PRIMER_PRODUCT_SIZE_RANGE); >>> >>> It works. When I looked at the source code for Primer3Redux, I >>> couldn't find add_targets, but set_parameters looked like it might >>> work, so I used that instead, and it worked. >>> >>> But I see over in the github that there are other issues with the >>> documentation (how primer3redux's result object is now 3 deep rather >>> than 2 deep). Not sure if this is in that category or not. >> >> That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github, if you can file this as well it would help. >> >> I have to go back and look at the history of add_targets() reative to primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience... >> >> chris >> >>> Thanks, >>> Anand >> ... 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > The Wellcome Trust Sanger Institute is operated by Genome Research > Limited, a charity registered in England with number 1021457 and a > company registered in England with number 2742969, whose registered > office is 215 Euston Road, London, NW1 2BE. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From locarpau at upvnet.upv.es Tue Aug 30 10:58:51 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Tue, 30 Aug 2011 16:58:51 +0200 Subject: [Bioperl-l] Saving Codeml Output file In-Reply-To: References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> <1314659986.4e5c1e9268078@webmail.upv.es> Message-ID: <1314716331.4e5cfaab4958e@webmail.upv.es> Thanks Jason, Ok, I see. That's what I was triying at the beggining. This runs OK in my scripts for branch-specific models. However, when I try branch-site models (NSsites > 0) and try to parse the results using my $model_result= $paml_result->get_NSSite_results I start to have problems. According to Dumper, I'm able to generate a Bio::Tools::Phylo::PAML object $paml_result but this doesn't store any Bio::Tools::Phylo::PAML::ModelResult that could be accessed using get_NSSite_results. See below a little piece of code to illustrate what I'm saying. my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml ( -alignment => $codon_MSA, -tree => $biotree, -params => { ... ...parameter values ... 
    },
);
my ($rc,$parser) = $codeml_factory->run(); # or run($dna_aln,$biotree)
#$codeml_factory->cleanup();
my $paml_result = $parser->next_result;
say Dumper $paml_result; #This returns a true Bio::Tools::Phylo::PAML::Result object!!!
my $model_result= $paml_result->get_NSSite_results;
say Dumper $model_result; #This doesn't return a true Bio::Tools::Phylo::PAML::ModelResult object ($VAR1 = 0;)!!!
$ns_string = "model ".$model_result->model_num."\n".$model_result->model_description()."\n".$model_result->time_used."\n";

As no ModelResult object is generated, the script stops, returning:

Can't call method "model_num" without a package or object reference

That's why I was trying to save the mlc output file and parse it, instead of parsing the Bio::Tools::Phylo::PAML object directly.
Best,
Lorenzo

PS: I'm using PAML version 4.4b (July 2010) and BioPerl 1.006901 on Mac OS X.

Quoting Jason Stajich :

> I think you are mistaken on how to use the factory running objects and
> associated parser.
>
> You don't have to instantiate a parser as this is what is returned by the run
> command. The whole point is you don't need to get to the tempdir or specify
> opening of the mlc file or all the other output files from the program. You
> get to use the parser to get the data out and then it cleans up afterwards so
> you can run many iterations of runs in separate folders without having to
> clean up afterwards.
>
> http://www.bioperl.org/wiki/HOWTO:PAML
>
> my $factory = Bio::Tools::Run::Phylo::PAML::Codeml->new( ... );
> my ($rc,$parser) = $factory->run( );
>
> if( my $result = $parser->next_result ) {
> # $result is a Bio::Tools::Phylo::PAML object
> }
>
>
> On Aug 29, 2011, at 4:19 PM, Lorenzo Carretero Paulet wrote:
>
> > Kevin,
> > That's pretty reasonable, but unfortunately still doesn't run. Even if I create
> > the file as $outfile and give it as value to the wrapper as -outfile
> > =>$outfile.
It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at > > creating the outfile. Did anyone manage to generate the outfile from > > Bio::Tools::Run::Phylo::PAML::Codeml. > > Cheers, > > Lorenzo > > > > Mensaje citado por Kevin Brown : > > > >> OK, went back to the original message. > >> > >> And here's where the problem actually originates... > >> > >> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > >> ( > >> # this should cause it to create a file > >> called mlc > >> -outfile => '>mlc', > >> -save_tempfiles => 1, > >> -alignment => > >> $codon_MSA, > >> -tree => > >> $biotree, > >> -params => > >> { > >> 'verbose' => 1, > >> 'noisy' => 9, > >> 'runmode' => 0, #user tree > >> 'seqtype' => 1, > >> 'model' => $model, > >> 'NSsites' => $NSsites, > >> 'fix_omega' => $fix_omega, > >> 'omega' => $omega, > >> 'ncatG' => $ncatG, > >> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see > >> below (5:ciliate nuclear) > >> #'fix_alpha' => 0, > >> #'fix_kappa' => 0, > >> #'RateAncestor' => 0, > >> 'CodonFreq' => 2, > >> 'cleandata' => 1, # remove sites with amibguity data (1 yes, 0 > >> no), > >> 'ndata' => 1 > >> }, > >> ); > >> > >> > >> Kevin Brown > >> Center for Innovations in Medicine > >> Biodesign Institute > >> Arizona State University > >> > >>> -----Original Message----- > >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > >>> bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet > >>> Sent: Monday, August 29, 2011 9:47 AM > >>> To: bioperl-l at lists.open-bio.org > >>> Subject: [Bioperl-l] Saving Codeml Output file > >>> > >>> Hi all, > >>> I'm running codeml from the PAML package using the corresponding > >>> Bioperl > >>> wrapper. 
I'd like to save the output file as -outfile => 'mlc', as in: > >>> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > >>> ( -outfile => 'mlc', > >>> -save_tempfiles => 1, > >>> -alignment => > >> $codon_MSA, > >>> -tree => > >> $biotree, > >>> -params => > >>> { > >>> #'outfile' =>'mlc', > >>> 'verbose' => 1, > >>> 'noisy' => 9, > >>> 'runmode' => 0, #user tree > >>> 'seqtype' => 1, > >>> 'model' => $model, > >>> 'NSsites' => $NSsites, > >>> 'fix_omega' => $fix_omega, > >>> 'omega' => $omega, > >>> 'ncatG' => $ncatG, > >>> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below > >>> (5:ciliate > >>> nuclear) > >>> #'fix_alpha' => 0, > >>> #'fix_kappa' => > >>> 0, #'RateAncestor' > >> => 0, > >>> 'CodonFreq' => > >> 2, > >>> 'cleandata' => > >>> 1, # remove sites with amibguity data (1 yes, 0 no), > >>> 'ndata' => 1 > >>> > >> }, > >>> ); > >>> > >>> and subsequently parsing it using > >>> my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => > >>> "./"); > >>> > >>> However, I get the following message. > >>> > >>> ------------- EXCEPTION ------------- > >>> MSG: Could not open mlc: No such file or directory > >>> STACK Bio::Root::IO::_initialize_io > >>> /Library/Perl//5.10.0/Bio/Root/IO.pm:351 > >>> STACK Bio::Tools::Phylo::PAML::new > >>> /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239 > >>> STACK main::BranchSiteEvolAnalysis > >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421 > >>> STACK toplevel > >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939 > >>> ------------------------------------- > >>> > >>> what I guess means the output file is not being saved in the previous > >>> step. > >>> Anyone knows what's wrong. > >>> Tnak you very much in advance for your help. 
> >>> Cheers, > >>> Lorenzo > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From shalabh.sharma7 at gmail.com Tue Aug 30 11:26:00 2011 From: shalabh.sharma7 at gmail.com (shalabh sharma) Date: Tue, 30 Aug 2011 11:26:00 -0400 Subject: [Bioperl-l] Bioperl query.... In-Reply-To: <4E5CC8AC.8050800@gmail.com> References: <4E5CC8AC.8050800@gmail.com> Message-ID: Hi Manju, Just follow the link sent by Roy. It also contain some useful example scripts. What i am suggesting is , you should run a blast on a very small data set that you can inspect easily and manually. Then parse it using SeachIO (follow the link) and you will get a fair idea that how it works. -Shalabh On Tue, Aug 30, 2011 at 7:25 AM, Roy Chaudhuri wrote: > Hi Manju, > > See: > http://www.bioperl.org/wiki/**HOWTO:SearchIO > > Cheers, > Roy. > > > On 30/08/2011 12:22, Manju Rawat wrote: > >> Hey Pls help me.. >> I am very new in Bioperl.. >> And i want to use blast report in my programming.. >> But i dnt know how to use it...pls tell me how to use HSP,gaps.etc >> methods??/ >> how to use them to extract valus from blast file.. 
>>
>> Thanks
>> Manju Rawat
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
Shalabh Sharma
Scientific Computing Professional Associate (Bioinformatics Specialist)
Department of Marine Sciences
University of Georgia
Athens, GA 30602-3636

From longbow0 at gmail.com Wed Aug 31 11:48:16 2011
From: longbow0 at gmail.com (longbow leo)
Date: Wed, 31 Aug 2011 10:48:16 -0500
Subject: [Bioperl-l] How to color leaves of a tree by Bio::Tree::Draw::Cladogram?
Message-ID:

Dear all,

I am using the module Bio::Tree::Draw::Cladogram to create a tree diagram.
But when I tried to color the tree leaves, the diagram was still without any
colors.

How can I color the tree leaves? Thanks in advance.

Here is my script:

######################################################################

#!/usr/bin/perl

use strict;
use warnings;

use Bio::TreeIO;
use Bio::Tree::Draw::Cladogram;

my $treei = Bio::TreeIO->new(
    -fh => \*DATA,
    -format => 'newick',
);

my $tree = $treei->next_tree;

# Color node 'B' to red
my ($nodeB) = $tree->find_node( -id => 'B' );

$nodeB->add_tag_value('Rcolor', 1);
$nodeB->add_tag_value('Gcolor', 0);
$nodeB->add_tag_value('Bcolor', 0);

my $cg = Bio::Tree::Draw::Cladogram->new(
    -tree => $tree,
);

$cg->print( -file => 'mytree.eps' );

__DATA__
(((A:5,B:5)90:2,C:4)25:3,D:10);

######################################################################

Regards,

Haizhou

From roy.chaudhuri at gmail.com Wed Aug 31 12:02:30 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Wed, 31 Aug 2011 17:02:30 +0100
Subject: [Bioperl-l] How to color leaves of a tree by Bio::Tree::Draw::Cladogram?
In-Reply-To: References: Message-ID: <4E5E5B16.9070704@gmail.com> Hi Haizhou, I think you need to specify -colors=>1 in your Bio::Tree::Draw::Cladogram constructor: my $cg = Bio::Tree::Draw::Cladogram->new( -tree => $tree, -colors => 1 ); Not sure why this isn't on by default. Roy. On 31/08/2011 16:48, longbow leo wrote: > Dear all, > > I am using the module Bio::Tree::Draw::Cladogram to create a tree diagram. > But when I tried to color the tree leaves, the diagram was still without any > colors. > > How can I color tree leave? Thanks in advance. > > Here is my script: > > ###################################################################### > > > #!/usr/bin/perl > > use strict; > use warnings; > > use Bio::TreeIO; > use Bio::Tree::Draw::Cladogram; > > my $treei = Bio::TreeIO->new( > -fh => \*DATA, > -format => 'newick', > ); > > my $tree = $treei->next_tree; > > # Color node 'B' to red > my ($nodeB) = $tree->find_node( -id => 'B' ); > > $nodeB->add_tag_value('Rcolor', 1); > $nodeB->add_tag_value('Gcolor', 0); > $nodeB->add_tag_value('Bcolor', 0); > > my $cg = Bio::Tree::Draw::Cladogram->new( > -tree => $tree, > ); > > $cg->print( -file => 'mytree.eps' ); > > __DATA__ > (((A:5,B:5)90:2,C:4)25:3,D:10); > > > ###################################################################### > > Regards, > > Haizhou > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Mon Aug 1 04:07:38 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 31 Jul 2011 23:07:38 -0500 Subject: [Bioperl-l] BioPerl Test requirements Message-ID: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> All, We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. 
One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? chris From cjfields at illinois.edu Mon Aug 1 04:42:39 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 31 Jul 2011 23:42:39 -0500 Subject: [Bioperl-l] protaparam In-Reply-To: References: Message-ID: <44853A9D-9E78-469E-B8D8-B06EBDB5F780@illinois.edu> Shachi, My guess is this is not a BioPerl-specific issue, but that the web service interface has changed or is no longer active. Unfortunately this is one module that has no tests associated with it, so this passed through the cracks. You are more than welcome to file a bug on this, but if the service is inactive we'll likely immediately deprecate the module. 
chris On Jul 28, 2011, at 11:46 PM, Shachi Gahoi wrote: > Dear All, > > If anybody know how to rum protparam using bioperl please let me know. > > > Thanks in advance > > -- > Regards, > Shachi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason.stajich at gmail.com Mon Aug 1 07:12:32 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Sun, 31 Jul 2011 23:12:32 -0800 Subject: [Bioperl-l] Fwd: Bio::Tools::Run::Phylo::Phyml, tree_string References: Message-ID: <3521B67E-D158-492A-8A60-025D6C5C9934@gmail.com> Heikki - can you take a look at this when you get time - I'm unclear what the BIONJ string is used for? Begin forwarded message: > From: Tristan Lefebure > Date: July 27, 2011 6:12:16 AM AKDT > To: bioperl mailing list > Subject: Re: [Bioperl-l] Bio::Tools::Run::Phylo::Phyml, tree_string > > done: > https://redmine.open-bio.org/issues/3273 > > -- > Tristan > > On Tue, Jul 26, 2011 at 9:43 PM, Chris Fields wrote: >> That's an odd one. Could you file this on redmine? >> >> chris >> >> On Jul 26, 2011, at 10:14 AM, Tristan Lefebure wrote: >> >>> Ouups, I found a typo in my post, it should read: >>> >>> I am not quite sure I understand why tree_string() from >>> Bio::Tools::Run::Phylo::Phyml returns >>> a string that looks like that (I removed the end of the tree): >>> >>> BIONJ(((((((('92':0.0114354726,'12':0.0472591023)0.0000000000:0.0000005859,... >>> >>> On Tue, Jul 26, 2011 at 4:47 PM, Tristan Lefebure >>> wrote: >>>> Hi there, >>>> I am not quite sure I understand why tree_string() from Bio::Tools::Run::Phylo::Phyml returns >>>> a string that looks like that (I removed the end of the tree): >>>> >>>> Tree is BIONJ(((((((('92':0.0114354726,'12':0.0472591023)0.0000000000:0.0000005859,... >>>> >>>> Why do we have this 'Tree is BIONJ' thing? 
>>>> >>>> A quick look at the code in the _run() function gives : >>>> >>>> { >>>> open(my $FH_TREE, "<", $tree_file) >>>> || $self->throw("Phyml call ($command) did not give an output: $?"); >>>> local $/; >>>> $self->{_tree} .= <$FH_TREE>; >>>> } >>>> >>>> Why appending something to $self->{_tree}? What about? >>>> $self->{_tree} = <$FH_TREE>; >>>> >>>> I was about to fill a bug report, but then I saw that in Phyml.t: >>>> >>>> is substr($factory->tree_string, 0, 9), 'BIONJ(SIN', 'tree_string()'; >>>> >>>> Well, I am lost. Any help much appreciated... >>>> >>>> -- >>>> Tristan >>>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Mon Aug 1 09:09:47 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 1 Aug 2011 11:09:47 +0200 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: Sounds good, Chris. Go for it. Dave From hlapp at drycafe.net Mon Aug 1 20:30:18 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Mon, 1 Aug 2011 16:30:18 -0400 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. -hilmar On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: > All, > > We are currently using a BioPerl-specific module for running tests > called Bio::Root::Test. 
It is essentially a wrapper module, re- > exporting all the methods for Test::More, Test::Exception, and > Test::Warn. One problem: it currently expects a copy of Test::Warn > and Test::Exception in each repository as a fallback. Another > problem: these included modules appear to be triggering dependencies > with debian packaging. > > As an example of one hidden dependency, the included Test::Warn > requires Array::Compare, which converted to Moose a few years ago, > so you automatically have to install the entire Moose dependency > tree, even though Bioperl doesn't require it (not a slam on Moose, > you really SHOULD be using Moose these days. No, really :). > > Anway, more recent versions of Test::Warn don't have this > requirement, but as we package an old version of this module we get > stuck with the dependencies until we (manually) update this for each > repository. Ick. > > I think the best solution is to remove the bioperl-local modules in > t/lib and list Test::Most instead as a 'build_requires' in Build.PL, > e.g. the module is only necessary for the build phase so is > optionally installed. Test::Most essentially does exactly the same > thing as Bio::Root::Test and more; it also includes Test::Deep and > Test::Diff (Bio::Root::Test has a few additional methods of use as > well). > > As this will require developers to use Test::Most instead, though, I > though it would be worth asking on the list to see if there are any > objections. Any thoughts? 
> > > chris > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Mon Aug 1 20:34:56 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 1 Aug 2011 15:34:56 -0500 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! chris On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. > > -hilmar > > On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: > >> All, >> >> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >> >> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). 
>> >> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >> >> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >> >> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >> >> >> chris >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at drycafe.net Mon Aug 1 22:36:27 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Mon, 1 Aug 2011 18:36:27 -0400 Subject: [Bioperl-l] Job opportunity: User Interface Design and Web Application Developer Message-ID: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net> (Apologies if you have received this already or if this is considered spam - we're trying to reach out as broad as possible and I know that quite a few in the Bio* communities would be well qualified. Please feel free to pass on to anyone who might be interested, or might know someone who is.) 
User Interface Design and Web Application Developer The National Evolutionary Synthesis Center (NESCent) seeks a creative and enthusiastic individual to design user interfaces and web applications for scientific applications that manage, analyze, visualize and share data in support of evolutionary research. The incumbent will work as part of a small informatics team in close collaboration with domain scientists. NESCent (http://nescent.org) is an NSF-funded center dedicated to cross-disciplinary research in evolutionary science. Our informatics team works closely with visiting and resident scientists to support their custom software and database development needs (http://informatics.nescent.org ), and collaborates broadly with other biodiversity informatics projects. All NESCent software products are open-source, and the Center has a number of initiatives to actively promote collaborative development of community software resources. Above all, we are enthusiastic about our work, about the mission of the Center, and about the contribution of informatics to that mission. Job description: The incumbent will design and develop user interfaces and web applications for databases and other software tools for sponsored scientists and staff. The job responsibilities include all stages of the software development process, including requirements gathering, design, implementation, release packaging and documentation, as part of a small team (typically 2-3 individuals). We expect the incumbent to present their work at conferences and contribute to publications with scientific collaborators; interact regularly with visiting and resident scientists, other members of the informatics team and Center staff; and generally serve as an expert resource for Center personnel. The position provides opportunities for professional development and encourages research into new technologies. 
Most informatics staff work at our Durham NC offices, located adjacent to Duke University, but we support a wide range of technologies for virtual communication with off-site staff and collaborators. Salary range: $70,000 - $80,000, depending on education and experience Required Qualifications: * Demonstrated success collaborating with clients on custom software solutions * Experience with various stages of the software development cycle * Expertise in development and testing of user interface designs * Excellent communication skills, both virtual and face-to-face Preferred Qualifications: * M.S. or Ph.D. in Computer Science, Bioinformatics or related field * Demonstrated interest in science, particularly biology * Expertise in dynamic and interactive web technologies (JavaScript, CGI) * Expertise in rapid application development and respective programming technologies and languages (e.g., modern scripting languages and web-application frameworks such as Python/Django, Ruby/ Ruby-on-Rails, and Perl/Catalyst). * Expertise in graphic design * Expertise in data visualization and/or scientific data integration * Expertise in software usability design and assessment * Expertise in web service (SOAP, REST, XML, JSON) and semantic web technologies * Fluency in Java programming * Prior experience in relational database programming (PostgreSQL or MySQL) * Experience with open-source, and collaborative, software development How to apply: Please send cover letter, resume and contact information for three references to Dr. Karen Cranston, Training Coordinator and Bioinformatics Project Manager (karen.cranston at nescent.org); Please also complete the online application at the University of North Carolina HR website: http://bit.ly/r9HQ8r. Informal inquires or requests for additional information may be directed to Dr. Cranston by email or phone (+1-919-613-2275). Closing date is August 15, 2011. 
-- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From florent.angly at gmail.com Tue Aug 2 00:09:51 2011 From: florent.angly at gmail.com (Florent Angly) Date: Tue, 02 Aug 2011 10:09:51 +1000 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Message-ID: <4E37404F.1040001@gmail.com> If Test::Most gives more testing capabilities and makes packaging Bioperl easier, I think it's pretty sweet! Florent On 02/08/11 06:34, Chris Fields wrote: > Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! > > chris > > On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > >> I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. >> >> -hilmar >> >> On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: >> >>> All, >>> >>> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >>> >>> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). 
>>> >>> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >>> >>> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >>> >>> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >>> >>> >>> chris >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : >> =========================================================== >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hartzell at alerce.com Tue Aug 2 00:06:54 2011 From: hartzell at alerce.com (George Hartzell) Date: Mon, 1 Aug 2011 17:06:54 -0700 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> Message-ID: <20023.16286.89015.854814@gargle.gargle.HOWL> Sounds great. g. 
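For anyone following the Test::Most thread above, here is a minimal sketch of what a test file might look like after the switch. The file name t/example.t and the revcom() helper are hypothetical, made up only for illustration; the parts taken from Test::Most itself are the import line and the exported functions is, throws_ok, warning_like, cmp_deeply and bag.

```perl
#!/usr/bin/perl
# t/example.t -- hypothetical test file, for illustration only
use strict;
use warnings;

# One import pulls in Test::More, Test::Exception, Test::Warn,
# Test::Deep and Test::Differences.
use Test::Most tests => 4;

# Hypothetical helper standing in for the BioPerl code under test.
sub revcom {
    my ($seq) = @_;
    die "empty sequence\n" unless defined $seq && length $seq;
    warn "non-ACGT characters present\n" if $seq =~ /[^ACGTacgt]/;
    my $rc = scalar reverse $seq;    # reverse the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # complement each base
    return $rc;
}

# Test::More style comparison
is( revcom('AACG'), 'CGTT', 'reverse complement of a short sequence' );

# Test::Exception style check on a die
throws_ok { revcom('') } qr/empty sequence/, 'dies on empty input';

# Test::Warn style check on a warning
warning_like { revcom('ACGN') } qr/non-ACGT/, 'warns on ambiguous bases';

# Test::Deep style comparison that ignores order
cmp_deeply( [ map { revcom($_) } qw(A C) ], bag( 'G', 'T' ),
    'single bases are complemented' );
```

On the Build.PL side, the proposal in the thread would presumably amount to an entry such as build_requires => { 'Test::Most' => 0 }, replacing the bundled copies under t/lib.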
From carandraug+dev at gmail.com Tue Aug 2 14:00:32 2011
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 2 Aug 2011 15:00:32 +0100
Subject: [Bioperl-l] wiki administrator needed
Message-ID:

Hi!

I have a problem with the bioperl wiki and have sent a support request
to 'support at open-bio.org' as instructed here
(http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems ). I
got the ticket ID #966. This was 2 weeks ago. Can someone with
administrator rights on the wiki do something about it?

Thanks in advance,
Carnë Draug

From p.j.a.cock at googlemail.com Tue Aug 2 14:56:30 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 2 Aug 2011 15:56:30 +0100
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References:
Message-ID:

2011/8/2 Carnë Draug :
> Hi!
>
> I have a problem with the bioperl wiki and have sent a support request
> to 'support at open-bio.org' as instructed here
> (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems ). I
> got the ticket ID #966. This was 2 weeks ago. Can someone with
> administrator rights on the wiki do something about it?
>
> Thanks in advance,
> Carnë Draug

What was the problem with the wiki (for the benefit of those
of us who might be able to fix it but are not on the support
system and didn't get your email)?

Peter

From carandraug+dev at gmail.com Tue Aug 2 15:06:10 2011
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 2 Aug 2011 16:06:10 +0100
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References:
Message-ID:

2011/8/2 Peter Cock :
> 2011/8/2 Carnë Draug :
>> I have a problem with the bioperl wiki and have sent a support request
>> to 'support at open-bio.org' as instructed here
>> (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems ). I
>> got the ticket ID #966. This was 2 weeks ago. Can someone with
>> administrator rights on the wiki do something about it?
>
> What was the problem with the wiki (for the benefit of those
> of us who might be able to fix it but are not on the support
> system and didn't get your email)?

I guess there should be no problem mentioning this on this open mailing
list. Here's the e-mail I sent back then:

When logging in with OpenID, I accidentally created a new account. Now I
can't use that OpenID for my real account since it's connected to that
other account. It also doesn't let me remove that OpenID from that
account.

My real account has the nickname 'Carandraug'.

The account I created by accident has the nickname '~carandraug'
(because I was trying to connect my account with the OpenID of
https://launchpad.net/~carandraug ).

Could someone please remove the '~carandraug' account? I couldn't find
a button to do so.

From hlapp at drycafe.net Tue Aug 2 16:25:48 2011
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Tue, 2 Aug 2011 12:25:48 -0400
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To:
References:
Message-ID: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>

I don't think the wiki allows removing of accounts (only blocking).
Someone would have to go into the MySQL database and do that.

-hilmar

On Aug 2, 2011, at 11:06 AM, Carnë Draug wrote:

> 2011/8/2 Peter Cock :
>> 2011/8/2 Carnë Draug :
>>> I have a problem with the bioperl wiki and have sent a support
>>> request
>>> to 'support at open-bio.org' as instructed here
>>> (http://www.bioperl.org/wiki/About_site#Help_with_Wiki_Problems ). I
>>> got the ticket ID #966. This was 2 weeks ago. Can someone with
>>> administrator rights on the wiki do something about it?
>>
>> What was the problem with the wiki (for the benefit of those
>> of us who might be able to fix it but are not on the support
>> system and didn't get your email)?
>
> Guess there should be no problem mentioning this on this open mailing
> list. Here's the e-mail I sent back then:
>
> When logging with OpenID, I accidentally created a new account.
Now I > can't use that OpenID for my real account since it's connected to that > other account. It also doesn't let me remove that OpenID from that > account. > > My real account has the nickname 'Carandraug'. > > The account I created by accident has the nickname '~carandraug' > (because I was trying to connect my account with the OpenID of > https://launchpad.net/~carandraug > > Could someone please remove the '~carandraug' account? I couldn't find > a button to do so. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From p.j.a.cock at googlemail.com Tue Aug 2 16:27:11 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 2 Aug 2011 17:27:11 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: 2011/8/2 Hilmar Lapp : > I don't think the wiki allows removing of accounts (only blocking). Someone > would have to go into the MySQL database and do that. The MediaWiki FAQ says don't do that, but does mention an optional add-on for merging wiki user accounts. We could block the unwanted account instead. Peter From cjfields at illinois.edu Tue Aug 2 16:35:36 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 11:35:36 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: I don't know if blocking that account will solve to OpenID problem (that it is associated with the bad account), but maybe merging that account and Carn?'s good one will work. Maybe it's worth looking at the add-on. 
chris On Aug 2, 2011, at 11:27 AM, Peter Cock wrote: > 2011/8/2 Hilmar Lapp : >> I don't think the wiki allows removing of accounts (only blocking). Someone >> would have to go into the MySQL database and do that. > > The MediaWiki FAQ says don't do that, but does mention an > optional add-on for merging wiki user accounts. > > We could block the unwanted account instead. > > Peter > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Aug 2 16:38:01 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 11:38:01 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: Carn?, Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. chris On Aug 2, 2011, at 11:27 AM, Peter Cock wrote: > 2011/8/2 Hilmar Lapp : >> I don't think the wiki allows removing of accounts (only blocking). Someone >> would have to go into the MySQL database and do that. > > The MediaWiki FAQ says don't do that, but does mention an > optional add-on for merging wiki user accounts. > > We could block the unwanted account instead. > > Peter > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From carandraug+dev at gmail.com Tue Aug 2 16:58:41 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 2 Aug 2011 17:58:41 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: On 2 August 2011 17:38, Chris Fields wrote: > Try logging in with the bad account, then go under 'my preferences'. 
There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps.

I had tried that the first time. However, it didn't let me do it because that OpenID was the one used to create the account.

Carnë

From ihok at hotmail.com Tue Aug 2 17:29:43 2011
From: ihok at hotmail.com (Jack Tanner)
Date: Tue, 2 Aug 2011 13:29:43 -0400
Subject: [Bioperl-l] fastq quality with initial @
Message-ID: 

i've got a fastq file with PHRED quality strings that sometimes start with '@'. this breaks the _index_file routine in Bio/Index/Fastq.pm.

i would've filed this in bugzilla, but i'm not authorized to do that.

From cjfields at illinois.edu Tue Aug 2 18:59:00 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 2 Aug 2011 13:59:00 -0500
Subject: [Bioperl-l] wiki administrator needed
In-Reply-To: 
References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net>
Message-ID: 

Let's see if we can get the merge account add-in working, then.

chris

On Aug 2, 2011, at 11:58 AM, Carnë Draug wrote:

> On 2 August 2011 17:38, Chris Fields wrote:
>> Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps.
>
> I had tried that the first time. However, it didn't let me do it because
> that OpenID was the one used to create the account.
>
> Carnë

From cjfields at illinois.edu Tue Aug 2 19:00:47 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 2 Aug 2011 14:00:47 -0500
Subject: [Bioperl-l] fastq quality with initial @
In-Reply-To: 
References: 
Message-ID: <441DB637-5586-488F-8943-FEA4D56C276B@illinois.edu>

On Aug 2, 2011, at 12:29 PM, Jack Tanner wrote:

>
> i've got a fastq file with PHRED quality strings that sometimes start with '@'.
this breaks the _index_file routine in Bio/Index/Fastq.pm. > i would've filed this in bugzilla, but i'm not authorized to do that. We no longer use bugzilla (as of v 1.6.900); see here: http://www.bioperl.org/wiki/Bugs Just register for an account and submit. I would check the latest code before doing so, just in case it has been fixed. chris From bosborne11 at verizon.net Tue Aug 2 20:24:54 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 02 Aug 2011 16:24:54 -0400 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: Chris, This is the one I've used: http://www.mediawiki.org/wiki/Extension:User_Merge_and_Delete BIO On Aug 2, 2011, at 2:59 PM, Chris Fields wrote: > Let's see if we can get the merge account add-in working, then. > > chris > > On Aug 2, 2011, at 11:58 AM, Carn? Draug wrote: > >> On 2 August 2011 17:38, Chris Fields wrote: >>> Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. >> >> I had try that the first time. However, it didn't let me do it because >> that OpenID was the one used to create the account. >> >> Carn? 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Aug 2 22:01:42 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 17:01:42 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> Message-ID: <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Carn?, I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. See if that works. chris On Aug 2, 2011, at 1:59 PM, Chris Fields wrote: > Let's see if we can get the merge account add-in working, then. > > chris > > On Aug 2, 2011, at 11:58 AM, Carn? Draug wrote: > >> On 2 August 2011 17:38, Chris Fields wrote: >>> Try logging in with the bad account, then go under 'my preferences'. There is an OpenID tag; this lists your OpenIDs, along with a 'delete' button. See if deleting the OpenID helps. >> >> I had try that the first time. However, it didn't let me do it because >> that OpenID was the one used to create the account. >> >> Carn? 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From carandraug+dev at gmail.com Tue Aug 2 22:19:38 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 2 Aug 2011 23:19:38 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On 2 August 2011 23:01, Chris Fields wrote: > Carn?, > > I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. ?See if that works. > > chris When I try to add this OpenID to my account, I still get the error: "That is someone else's OpenID." If I try to log in with this OpenID, after saying that I'm logged in successfully, the site still looks as if I'm not logged in, with a button to 'log in' and an IP address instead of a username. Another problem that I have when logging is that sometimes mediawiki sends 'https://login.launchpad.net/ id/y7xtYzD' instead of 'https://login.launchpad.net/~carandraug' to the launchpad server. I don't know what's causing this. Trying to backspace and delete what may be invisible characters before and after the string sometimes solves this. This happens even though I type this character by character so if there's any invisble stuff on the form it must be there before. This occurs when using Iceweasel 3.5 (in Debian), Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). Carn? 
Draug From cjfields at illinois.edu Tue Aug 2 22:39:19 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 17:39:19 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On Aug 2, 2011, at 5:19 PM, Carn? Draug wrote: > On 2 August 2011 23:01, Chris Fields wrote: >> Carn?, >> >> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. See if that works. >> >> chris > > When I try to add this OpenID to my account, I still get the error: > "That is someone else's OpenID." Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. Kinda wondered if that would happen... > If I try to log in with this OpenID, after saying that I'm logged in > successfully, the site still looks as if I'm not logged in, with a > button to 'log in' and an IP address instead of a username. > > Another problem that I have when logging is that sometimes mediawiki > sends 'https://login.launchpad.net/ id/y7xtYzD' instead of > 'https://login.launchpad.net/~carandraug' to the launchpad server. I > don't know what's causing this. Trying to backspace and delete what > may be invisible characters before and after the string sometimes > solves this. This happens even though I type this character by > character so if there's any invisble stuff on the form it must be > there before. This occurs when using Iceweasel 3.5 (in Debian), > Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). > > Carn? Draug Not sure myself, sounds like a MW bug. See if the OpenID works first, then maybe we can address that. 
chris From carandraug+dev at gmail.com Tue Aug 2 22:56:49 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 2 Aug 2011 23:56:49 +0100 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: 2011/8/2 Chris Fields : > On Aug 2, 2011, at 5:19 PM, Carn? Draug wrote: >> On 2 August 2011 23:01, Chris Fields wrote: >>> Carn?, >>> >>> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. ?See if that works. >>> >> >> When I try to add this OpenID to my account, I still get the error: >> "That is someone else's OpenID." > > Apparently UserMerge doesn't clean up empty OpenID. ?I found that one (login.launchpad.net/~carandraug) and manually deleted it. ?The user ID it was associated with no longer existed in the user tables. This is solved. I connected my account with this OpenID and can now log in with it. Thank you >> Another problem that I have when logging is that sometimes mediawiki >> sends 'https://login.launchpad.net/ id/y7xtYzD' instead of >> 'https://login.launchpad.net/~carandraug' to the launchpad server. I >> don't know what's causing this. Trying to backspace and delete what >> may be invisible characters before and after the string sometimes >> solves this. This happens even though I type this character by >> character so if there's any invisble stuff on the form it must be >> there before. This occurs when using Iceweasel 3.5 (in Debian), >> Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). This still happens sometimes. It just happened now. I had also fill a support request about this issue some weeks ago (ticket #965). No idea what's been causing this. Carn? 
From cjfields at illinois.edu Wed Aug 3 01:55:23 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 2 Aug 2011 20:55:23 -0500 Subject: [Bioperl-l] wiki administrator needed In-Reply-To: References: <0E50C8F0-2FCD-4AAB-83DD-DF6229E382C5@drycafe.net> <0CE0963F-3D4C-4EDB-A77F-859598E24DE2@illinois.edu> Message-ID: On Aug 2, 2011, at 5:56 PM, Carn? Draug wrote: > 2011/8/2 Chris Fields : >> On Aug 2, 2011, at 5:19 PM, Carn? Draug wrote: >>> On 2 August 2011 23:01, Chris Fields wrote: >>>> Carn?, >>>> >>>> I installed the add-in, merged the old account (~carandraug) into the one specified (Carandraug ), and deleted the old account. See if that works. >>>> >>> >>> When I try to add this OpenID to my account, I still get the error: >>> "That is someone else's OpenID." >> >> Apparently UserMerge doesn't clean up empty OpenID. I found that one (login.launchpad.net/~carandraug) and manually deleted it. The user ID it was associated with no longer existed in the user tables. > > This is solved. I connected my account with this OpenID and can now > log in with it. Thank you No problem. Apparently there is a bug fix in the more recent versions of OpenID and UserMerge, I'll add a redmine task to make sure they get updated (have my hands full right now, and OpenID can sometimes be tricky to debug). >>> Another problem that I have when logging is that sometimes mediawiki >>> sends 'https://login.launchpad.net/ id/y7xtYzD' instead of >>> 'https://login.launchpad.net/~carandraug' to the launchpad server. I >>> don't know what's causing this. Trying to backspace and delete what >>> may be invisible characters before and after the string sometimes >>> solves this. This happens even though I type this character by >>> character so if there's any invisble stuff on the form it must be >>> there before. This occurs when using Iceweasel 3.5 (in Debian), >>> Firefox 3.6 (in Ubuntu) and Firefox 5 (in MacOSX). > > This still happens sometimes. It just happened now. 
I had also filed a
> support request about this issue some weeks ago (ticket #965). No idea
> what's been causing this.
>
> Carnë

Okay, as long as it's noted somewhere.

chris

From kai.blin at biotech.uni-tuebingen.de Wed Aug 3 08:55:04 2011
From: kai.blin at biotech.uni-tuebingen.de (Kai Blin)
Date: Wed, 03 Aug 2011 10:55:04 +0200
Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior
Message-ID: <4E390CE8.2050100@biotech.uni-tuebingen.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi folks,

as I mentioned on https://redmine.open-bio.org/issues/3264 there is something odd going on with Bio::Root::IO's _readline/_pushback functions. This seems to be intentional; at least there is a test case asserting the behaviour I'm seeing. It is, however, very confusing to the unsuspecting programmer using the code.

One assumption I'd immediately make would be that if I have code that does a

$foo = $io->_readline; $io->_pushback($foo); $bar = $io->_readline;

then $foo will be the same string as $bar, regardless of what other pieces of the code did. Currently, this is not the case, because the readbuffer that _pushback pushes back into has new strings appended to the end but readline removes them from the front.

This easily violates the "principle of least surprise", so I think we should change the readbuffer to a stack. As far as I can tell, changing the _pushback function to "unshift" instead of "push" to the readbuffer breaks only the Root/RootIO.t test designed to test the old behaviour. I don't see any other tests failing on my system that don't fail without this patch.

Any comments from the core devs?

Cheers,
Kai

- --
Dipl.-Inform.
Kai Blin kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-Universität Tübingen
Auf der Morgenstelle 28 Phone : ++49 7071 29-78841
D-72076 Tübingen Fax : ++49 7071 29-5979
Germany
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOOQzoAAoJEKM5lwBiwTTPO6QIAMDN1bAm1FFD98F0rhN7TCpW
sV2sLkQDESK9YjCxp3kAqCpg7ZCArcA5l7HmEdAZFTzdFnsfnvKJmNB86C30QXJs
6XcYSbvBIPQdhjK7WIhG2pANItiTxKTGgXDZklVjgj2dVT4kSkCgdGYAAMssT1hn
n1/jkBJu5uuCq43Wv5Ia+wEhdN0M+xgKc9x7MF/ikO2qr6x24odMNTW8VgyLsYie
p9M68U23aStip2rxV1hrhZzbnjLz66V6O9fIEHmm5CYLfcGXkcrclzLIeptepSj1
bj/7dWIdXy8VnoSNx4RbckHSkMbdIkmyPKzmoYFN7p3FvmrSXsOmB6nfD0hEkbY=
=S5ff
-----END PGP SIGNATURE-----

From shelly.mh at gmail.com Tue Aug 2 10:19:33 2011
From: shelly.mh at gmail.com (Shelly M)
Date: Tue, 2 Aug 2011 13:19:33 +0300
Subject: [Bioperl-l] question regarding Bio::DB::CUTG
Message-ID: 

Hello,

My name is Shelly and I'm a student at the Hebrew University of Jerusalem. I'm trying to use the package Bio::DB::CUTG but I have some trouble retrieving the right table for a given organism. For example, if I write

my $cdtable = Bio::DB::CUTG->new(-sp =>'Mus musculus');

I get a warning message :MSG: too many species - not a unique species id, and it returns _species => mitochondrion Mus musculus.

So my question is: what is the exact format for retrieving the specific organism?

Thanks a lot for the help,
Shelly

From maximilien1er at gmail.com Wed Aug 3 02:50:44 2011
From: maximilien1er at gmail.com (=?ISO-8859-1?Q?Maxime_D=E9raspe?=)
Date: Tue, 2 Aug 2011 19:50:44 -0700 (PDT)
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
Message-ID: 

Hi,

when I parse a genbank file no matter what I do, the /translation="MKAV.."
tag value of a CDS never appears in the last place, as it should. Other tags like /note= and /product come after /translation, which is not the usual practice in genbank files. Could anyone suggest how to deal with this, i.e. how to put the /translation tag value in the last place when I write the genbank file?

Thank you !

Max

From shachigahoimbi at gmail.com Wed Aug 3 06:00:44 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Wed, 3 Aug 2011 11:30:44 +0530
Subject: [Bioperl-l] How to show branch length value in tree
Message-ID: 

Dear All

I am using Bio::Tree modules for constructing and drawing a tree. *I am unable to show the branch length values in the tree.*

Please tell me how I can do this, if anybody knows.

Here is the script I am using; I have also attached the generated tree.

Thanks in advance

################################################################################################

use Bio::AlignIO;
use Bio::Align::ProteinStatistics;
use Bio::Tree::DistanceFactory;
use Bio::TreeIO;
use Bio::Tree::Draw::Cladogram;

# for a dna alignment
# can also use ProteinStatistics

my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw');

my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');

my $stats = Bio::Align::ProteinStatistics->new;

my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd');

while( my $aln = $alnio->next_aln )
{
    my $mat = $stats->distance(-method => 'Kimura', -align => $aln);

    my $tree = $dfactory->make_tree($mat);
    $treeout->write_tree($tree);
}

my $dir = shift || '.';

opendir(DIR, $dir) || die $!;
for my $file ( readdir(DIR) )
{
    next unless $file =~ /(\S+)\.dnd$/;
    my $stem = $1;
    my $treeio = Bio::TreeIO->new('-format' => 'newick',
                                  '-file' => "$dir/$file");

    if( my $t1 = $treeio->next_tree )
    {
        my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
                                                   -tree => $t1,
                                                   -compact => 0);
        $obj1->print(-file => "$dir/$stem.eps");
    }
}
######################################################################################################## -- Regards, Shachi -------------- next part -------------- A non-text attachment was scrubbed... Name: ADP1.dnd Type: application/octet-stream Size: 1369 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ADP1.eps Type: application/postscript Size: 17718 bytes Desc: not available URL: From cjfields at illinois.edu Wed Aug 3 13:10:20 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 08:10:20 -0500 Subject: [Bioperl-l] Question to Bio::SearchIO::infernal.pm In-Reply-To: <4E32E14B020000EE00004F57@gwia1.boku.ac.at> References: <4E32E14B020000EE00004F57@gwia1.boku.ac.at> Message-ID: Nadine, Hard to guess w/o seeing the report, but I'm not terribly surprised. I believe I only coded for simple 1 CM reports, IIRC. You'll have to file this as a bug on redmine along with an example. chris On Jul 29, 2011, at 9:35 AM, Nadine Elpida Tatto wrote: > Hi There! > > > > I was wondering if you would or can help me. > > > I have an infernal report containing about 2000 CMs from an infernal run against Rfam.cm. To parse this report I wanted to use Bio::SearchIO::infernal.pm. Unfortunately this turned out to be a problem for me, because "$parser->next_result" only delivers the result for the first CM in the report and nothing more. > > > My code: > #!/usr/bin/perl -w > > > use strict;use Data::Dumper; > use Bio::SearchIO; > > > my $infile = $ARGV[0]; # infernal report > my $parser = Bio::SearchIO->new(-format => 'Infernal', > -file => $infile); > > > while( my $result = $parser->next_result ) { > print $result->query_name . "\n"; > } > > > exit; > > > > > The output: > > > ntatto:~$ ./infernalParser.pl infernal.output > 5S_rRNA > ntatto:~$ > > > > > I would expect the following (like parsing a blast report): > > > ntatto:~$ ./infernalParser.pl infernal.output > 5S_rRNA > 5_8S_rRNA > U1 > ... 
> ntatto:~$ > > > > I would be glad for help. > > > Thank you in advance. > > > Best Regards > > > N Tatto > > > > > > > > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From p.j.a.cock at googlemail.com Wed Aug 3 13:46:06 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 14:46:06 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: Message-ID: 2011/8/3 Maxime D?raspe : > Hi, > > when I parse a genbank file no matter what I do, the / > translation="MKAV.." tag value of a CDS never appear in the last place > as it should be. Other tags like /note= /product comes after / > translation which it's not the usual practice with genbank file. Could > anyone have an idea how to deal with it... put /translation tag value > in the last place when I write the genbank file. > > Thank you ! > > Max Hi Max, I'm not aware of anything in the feature table specification about the order of the feature qualifiers (the "tags" like /note and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/ I suspect BioPerl is using a hash (Biopython uses a dictionary) for the feature qualifiers, which would discard the order. Why do you care about the order? Peter From roy.chaudhuri at gmail.com Wed Aug 3 13:58:22 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 03 Aug 2011 14:58:22 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: References: Message-ID: <4E3953FE.5080304@gmail.com> Hi Shachi, I don't think you can draw labels on branches using Bio::Tree::Draw::Cladogram. 
However, it will draw node labels, so you could copy the branch lengths over to the node ids:

my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
                                           -tree => $t1,
                                           -compact => 0);
for my $node ($tree->get_nodes) {
    $node->id($node->branch_length) if defined $node->branch_length;
}
$obj1->print(-file => "$dir/$stem.eps")

Incidentally, in your script you write the tree out to a file, then read it back in using TreeIO. This is unnecessary, you can use $tree directly as input to Bio::Tree::Draw::Cladogram.

Alternatively, you could write out a newick file and use non-Bioperl software such as njplot or MEGA to draw your tree with labelled branch lengths.

Cheers,
Roy.

On 03/08/2011 07:00, Shachi Gahoi wrote:
> Dear All
>
> I am using Bio::Tree modules for constructing and drawing tree. *I am unable
> to show branch length value in tree.
> *
> Please tell me How can I do this, if anybody knows.
>
> Here is my script which i am using...and i also attached generated tree.
>
> Thanks in advance
>
> ################################################################################################
>
> use Bio::AlignIO;
> use Bio::Align::ProteinStatistics;
> use Bio::Tree::DistanceFactory;
> use Bio::TreeIO;
> use Bio::Tree::Draw::Cladogram;
>
> # for a dna alignment
> # can also use ProteinStatistics
>
> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw');
>
> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');
>
> my $stats = Bio::Align::ProteinStatistics->new;
>
> my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd');
>
> while( my $aln = $alnio->next_aln )
> {
>     my $mat = $stats->distance(-method => 'Kimura', -align => $aln);
>
>     my $tree = $dfactory->make_tree($mat);
>     $treeout->write_tree($tree);
> }
>
> my $dir = shift || '.';
>
> opendir(DIR, $dir) || die $!;
> for my $file ( readdir(DIR) )
> {
>     next unless $file =~ /(\S+)\.dnd$/;
>     my $stem = $1;
>     my $treeio = Bio::TreeIO->new('-format' => 'newick',
>                                   '-file' =>
"$dir/$file"); > > if( my $t1 = $treeio->next_tree ) > { > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > $obj1->print(-file => "$dir/$stem.eps"); > } > } > > ######################################################################################################## > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From roy.chaudhuri at gmail.com Wed Aug 3 14:01:18 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 03 Aug 2011 15:01:18 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: <4E3953FE.5080304@gmail.com> References: <4E3953FE.5080304@gmail.com> Message-ID: <4E3954AE.2080401@gmail.com> Sorry, the code had a typo, it should be: my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, -tree => $t1, -compact => 0); for my $node ($t1->get_nodes) { $node->id($node->branch_length) if defined $node->branch_length; } $obj1->print(-file => "$dir/$stem.eps") On 03/08/2011 14:58, Roy Chaudhuri wrote: > Hi Shachi, > > I don't think you can draw labels on branches using > Bio::Tree::Draw::Cladogram. However, it will draw node labels, so you > could copy the branch lengths over to the node ids: > > my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > for my $node ($tree->get_nodes) { > $node->id($node->branch_length) if defined $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > Incidentally, in your script you write the tree out to a file, then read > it back in using TreeIO. This is unnecessary, you can use $tree directly > as input to Bio::Tree::Draw::Cladogram. > > Alternatively, you could write out a newick file and use non-Bioperl > software such as njplot or MEGA to draw your tree with labelled branch > lengths. > > Cheers, > Roy. 
> > On 03/08/2011 07:00, Shachi Gahoi wrote: >> Dear All >> >> I am using Bio::Tree modules for constructing and drawing tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also attached generated tree. >> >> Thanks in advance >> >> ################################################################################################ >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw'); >> >> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics->new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', -align => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> ######################################################################################################## >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Wed Aug 3 14:08:33 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 
2011 09:08:33 -0500
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To: 
References: 
Message-ID: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu>

On Aug 3, 2011, at 8:46 AM, Peter Cock wrote:

> 2011/8/3 Maxime Déraspe :
>> Hi,
>>
>> when I parse a genbank file no matter what I do, the /
>> translation="MKAV.." tag value of a CDS never appear in the last place
>> as it should be. Other tags like /note= /product comes after /
>> translation which it's not the usual practice with genbank file. Could
>> anyone have an idea how to deal with it... put /translation tag value
>> in the last place when I write the genbank file.
>>
>> Thank you !
>>
>> Max
>
> Hi Max,
>
> I'm not aware of anything in the feature table specification
> about the order of the feature qualifiers (the "tags" like /note
> and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/
>
> I suspect BioPerl is using a hash (Biopython uses a dictionary)
> for the feature qualifiers, which would discard the order.
>
> Why do you care about the order?
>
> Peter

Yes, it uses a hash based on the feature tags. Not sure how Biopython handles it but my guess is something similar (Peter?).

The output order was never a chief concern of ours. To tell the truth our main focus has never been simple conversion, except to transform data into a format that is more manageable/normalized.

For those interested in making this change, all the code for printing features is in one method in Bio::SeqIO::genbank, _print_GenBank_FTHelper(). The best way to handle this would be to allow an optional coderef/callback that takes the feature (or the tags) and allows custom sorting and printing; I don't want to get into messy semantics on how to specifically sort tags, best to let the user decide.
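A minimal sketch of the callback idea described above, in plain Perl with no BioPerl dependency. The names order_tags and $translation_last are hypothetical and are not part of Bio::SeqIO::genbank; the sketch only illustrates how a user-supplied coderef could decide the tag order (here, alphabetical with 'translation' forced to the end), with alphabetical order assumed as the default.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl API): return tag names in the order
# chosen by an optional user callback, defaulting to alphabetical order.
sub order_tags {
    my ($tags, $sorter) = @_;
    $sorter ||= sub { return sort @_ };
    return $sorter->(@$tags);
}

# Example callback: alphabetical, but with 'translation' always last.
my $translation_last = sub {
    my @tags = @_;
    return sort {
        ($a eq 'translation') <=> ($b eq 'translation')
            or $a cmp $b
    } @tags;
};

my @tags = qw(translation product note codon_start);
print join(',', order_tags(\@tags, $translation_last)), "\n";
# prints: codon_start,note,product,translation
```

In a real patch the callback would presumably receive the feature object (or its tag list) from the printing method, so each user could pick an ordering without the module committing to one.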
chris From cjfields at illinois.edu Wed Aug 3 14:16:37 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 09:16:37 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4E390CE8.2050100@biotech.uni-tuebingen.de> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi folks, > > as I mentioned on https://redmine.open-bio.org/issues/3264 there is > something odd going on with Bio::Root::IO's _readline/_pushback > functions. This seems to be intentional, at least there is a test case > asserting the behaviour I'm seeing. It his however very confusing to the > unexpecting programmer using the code. > > One assumption I'd immediately make would be that if I have code that > does a $foo = $io->_readline; $io->_pushback($foo); $bar = > $io->_readline;, $foo will be the same string as $bar, regardless what > other pieces of the code did. Currently, this is not the case, because > the readbuffer that _pushback pushes back into has new strings appended > to the end but readline removes them from the front. I think this test is performed in the regressions already, but if not then it is more than welcome. > This easily violates the "principle of least surprise", so I think we > should change the readbuffer to a stack. As far as I can tell, changing > the _pushback function to "unshift" instead of "push" to the readbuffer > breaks only the Root/RootIO.t test designed to test the old behaviour. I > don't see any other tests failing on my system that don't fail without > this patch. > > Any comments from the core devs? I don't have a problem with that beyond the change to the RootIO.t tests (it implies a specific behavior that some developers expect, so is a very subtle API change). However, this is how one would expect it, to be more like an 'unread' stack instead of a queue. 
In fact, there is a module I used for Biome's pushback/readline called IO::Unread that implements an IO layer for mimicking this behavior; it might be worth looking into. > Cheers, > Kai chris Christopher Fields Senior Research Scientist National Center for Supercomputing Applications Institute for Genomic Biology University of Illinois Urbana-Champaign 1206 W. Gregory Dr., MC-195 Urbana, IL 61801 From p.j.a.cock at googlemail.com Wed Aug 3 14:45:21 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 15:45:21 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu> References: <4585DD3A-8E0A-4820-BA34-8154146A0BC8@illinois.edu> Message-ID: On Wed, Aug 3, 2011 at 3:08 PM, Chris Fields wrote: > > Yes, it uses a hash based on the feature tags. Not sure how Biopython > handles it but my guess is something similar (Peter?). Yes, we key on the feature qualifier (e.g. note or product) and the values are a list of qualifier values (e.g. you can have two notes). > The output order was never a chief concern of ours. To tell the truth > our main focus has never been simple conversion, except to transform > data into a format that is more manageable/normalized. > > For those interested in making this change, all the code for printing > features is in one method in Bio::SeqIO::genbank, _print_GenBank_FTHelper(). > The best way to handle this would be to allow an optional coderef/callback > that takes the feature (or the tags) and allows custom sorting and printing; > I don't want to get into messy semantics on how to specifically sort tags, > best to let the user decide. For Biopython switching from the default dictionary (hash type) to an order-preserving dictionary would be one option. I too have no wish to try and implement qualifier sorting without an explicit standard.
Peter From maximilien1er at gmail.com Wed Aug 3 14:48:05 2011 From: maximilien1er at gmail.com (Maxime Déraspe) Date: Wed, 3 Aug 2011 07:48:05 -0700 (PDT) Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: Message-ID: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> > Hi Max, > > I'm not aware of anything in the feature table specification > about the order of the feature qualifiers (the "tags" like /note > and /product). See http://www.ncbi.nlm.nih.gov/collab/FT/ > > I suspect BioPerl is using a hash (Biopython uses a dictionary) > for the feature qualifiers, which would discard the order. > > Why do you care about the order? > > Peter > Hi Peter, I care about the order for the submission to ncbi. But I guess they will reformat the file before getting it in their database. It's also visually better when the translation of the protein comes in the end of the annotation for the CDS and not before /product, /note .... Anyway maybe I'll reformat the file in sequin table for a direct submission to ncbi with sequin. Thank you. Max From p.j.a.cock at googlemail.com Wed Aug 3 16:00:01 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 3 Aug 2011 17:00:01 +0100 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: 2011/8/3 Maxime Déraspe : >> >> Why do you care about the order? >> > > Hi Peter, > > I care about the order for the submission to ncbi. Do the NCBI have some guidelines which ask for a particular order? > But I guess they > will reformat the file before getting it in their database. They seem to generate the official GenBank files from their database - so I doubt the input order matters.
> It's also > visually better when the translation of the protein comes in the end > of the annotation for the CDS and not before /product, /note .... I do see your point, but if that were the only motivation I wouldn't want to make generating GenBank output any more complicated than it already is. > Anyway maybe I'll reformat the file in sequin table for a direct > submission to ncbi with sequin. > > Thank you. > > Max Peter From cjfields at illinois.edu Wed Aug 3 16:52:02 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 11:52:02 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: > 2011/8/3 Maxime Déraspe : >>> >>> Why do you care about the order? >>> >> >> Hi Peter, >> >> I care about the order for the submission to ncbi. > > Do the NCBI have some guidelines which ask for a particular order? No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. >> But I guess they >> will reformat the file before getting it in their database. > > They seem to generate the official GenBank files from their > database - so I doubt the input order matters. Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). >> It's also >> visually better when the translation of the protein comes in the end >> of the annotation for the CDS and not before /product, /note .... > > I do see your point, but if that were the only motivation I wouldn't > want to make generating GenBank output any more complicated > than it already is. ... >> Anyway maybe I'll reformat the file in sequin table for a direct >> submission to ncbi with sequin.
>> >> Thank you. >> >> Max > > Peter Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): https://redmine.open-bio.org/ chris From cjfields at illinois.edu Wed Aug 3 17:10:31 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 3 Aug 2011 12:10:31 -0500 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Message-ID: <51452A39-42B7-4BBF-9F50-A37419E75454@illinois.edu> IMHO I find genbank too unwieldy, but it's nice to know the output works for NCBI submission. chris On Aug 3, 2011, at 12:06 PM, Brian Osborne wrote: > Peter, > > I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. 
In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. > > Brian O > > On Aug 3, 2011, at 12:52 PM, Chris Fields wrote: > >> On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: >> >>> 2011/8/3 Maxime Déraspe : >>>>> >>>>> Why do you care about the order? >>>>> >>>> >>>> Hi Peter, >>>> >>>> I care about the order for the submission to ncbi. >>> >>> Do the NCBI have some guidelines which ask for a particular order? >> >> No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. >> >>>> But I guess they >>>> will reformat the file before getting it in their database. >>> >>> They seem to generate the official GenBank files from their >>> database - so I doubt the input order matters. >> >> Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). >> >>>> It's also >>>> visually better when the translation of the protein comes in the end >>>> of the annotation for the CDS and not before /product, /note .... >>> >>> I do see your point, but if that were the only motivation I wouldn't >>> want to make generating GenBank output any more complicated >>> than it already is. >> ... >>>> Anyway maybe I'll reformat the file in sequin table for a direct >>>> submission to ncbi with sequin. >>>> >>>> Thank you. >>>> >>>> Max >>> >>> Peter >> >> >> Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format.
Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. >> >> We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. >> >> Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): >> >> https://redmine.open-bio.org/ >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bosborne11 at verizon.net Wed Aug 3 17:06:05 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 03 Aug 2011 13:06:05 -0400 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> Message-ID: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Peter, I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. Brian O On Aug 3, 2011, at 12:52 PM, Chris Fields wrote: > On Aug 3, 2011, at 11:00 AM, Peter Cock wrote: > >> 2011/8/3 Maxime Déraspe : >>>> >>>> Why do you care about the order? >>>> >>> >>> Hi Peter, >>> >>> I care about the order for the submission to ncbi. >> >> Do the NCBI have some guidelines which ask for a particular order?
> > No, beyond the feature table there is no specification that indicates such that I am aware of. Submitted data is tabular; sequin is a nicer GUI API for getting data into a useful format for submission to NCBI, where data is converted to ASN.1 I believe. > >>> But I guess they >>> will reformat the file before getting it in their database. >> >> They seem to generate the official GenBank files from their >> database - so I doubt the input order matters. > > Yep, that's correct. If NCBI ruled the world everyone would be using ASN.1 (b/c that's what they use internally). > >>> It's also >>> visually better when the translation of the protein comes in the end >>> of the annotation for the CDS and not before /product, /note .... >> >> I do see your point, but if that were the only motivation I wouldn't >> want to make generating GenBank output any more complicated >> than it already is. > ... >>> Anyway maybe I'll reformat the file in sequin table for a direct >>> submission to ncbi with sequin. >>> >>> Thank you. >>> >>> Max >> >> Peter > > > Maxime, I find most users try to avoid using GenBank format except when absolutely needed. There is a very good reason Sequin and tbl2asn are used by NCBI for submissions; they end up generating simple tabular data that is easier to feed into their internal ASN.1 format. Genbank is a nice human-readable format, but structure-wise I find it's a pain to deal with, not to mention the variant third-party 'genbank' data that users want us to handle. > > We try to support generation of output within reason, but that's never been our primary goal. As long as the output generated is capable of being re-read by our parsers with the data intact and generates sane data we're pretty happy. 
> > Saying that, any additions to deal with this are perfectly welcome (I pointed out one mechanism that could be used), but they would have to address the concerns Peter and I alluded to previously, and it would be nice to evaluate how any changes affect performance. You are more than welcome to submit this as a feature request using our redmine server (including patches if you do this yourself): > > https://redmine.open-bio.org/ > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From lskatz at gmail.com Wed Aug 3 21:01:24 2011 From: lskatz at gmail.com (Lee Katz) Date: Wed, 3 Aug 2011 17:01:24 -0400 Subject: [Bioperl-l] SeqIO: paired end reads Message-ID: Hi all! I was wondering how to construct paired end reads from scratch. I know the locations of certain sequences across the genome with a high degree of confidence and so I want to give them to my assembler as paired end reads, along with my other sequence runs (454 and Illumina runs). I plan to use Newbler. My only problem is that I do not know the correct format in order to specify distance and sequences for a paired end reads run, and so I hope that there is a SeqIO solution. At the least, I hope that one bioperl member can point me to where the definition of the paired end reads file format is...? Thank you! --Lee From jason.stajich at gmail.com Wed Aug 3 21:17:01 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Wed, 3 Aug 2011 13:17:01 -0800 Subject: [Bioperl-l] SeqIO: paired end reads In-Reply-To: References: Message-ID: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> it depends on the assembler - For Illumina usually the paired ends end with /1 /2 and they have the same ID but are in two different files. Depends on if you are using interleaved paired reads or in two separate files. 
some just expect the paired reads to be mated by virtue of being in same order in two files. the ABYSS and Velvet manuals both explain what is expected so you will want to check on what are Newbler's assumptions on how the paired ends are encoded. There are simulator tools if that is what you are trying to do in the end? checkout wgsim which comes with samtools or try dnaa On Aug 3, 2011, at 1:01 PM, Lee Katz wrote: > Hi all! I was wondering how to construct paired end reads from scratch. I > know the locations of certain sequences across the genome with a high degree > of confidence and so I want to give them to my assembler as paired end > reads, along with my other sequence runs (454 and Illumina runs). I plan to > use Newbler. > > My only problem is that I do not know the correct format in order to specify > distance and sequences for a paired end reads run, and so I hope that there > is a SeqIO solution. At the least, I hope that one bioperl member can point > me to where the definition of the paired end reads file format is...? > > Thank you! > > --Lee > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From roy.chaudhuri at gmail.com Thu Aug 4 11:22:23 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Thu, 04 Aug 2011 12:22:23 +0100 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> Message-ID: <4E3A80EF.2010409@gmail.com> Hi Shachi, Please keep replies on the mailing list, that way others can follow the discussion. As I mentioned, it is not possible to draw njplot-style trees with labelled branches using Bio::Tree::Draw::Cladogram, it currently only labels nodes (you could perhaps add branch labels as a feature request on Redmine). 
The code I gave overwrites the existing "leaf" node ids (the accessions) with branch lengths, if you want to also keep the existing labels you could try something like:

for my $node ($t1->get_nodes) {
    if ($node->is_Leaf) {
        $node->id($node->branch_length.' '.$node->id);
    } else {
        $node->id($node->branch_length)
    }
}

Cheers,
Roy.

On 04/08/2011 05:36, Shachi Gahoi wrote:
> Thank You so much. Now branch length is coming in tree.
>
> But I want Accession number in place of node id.
>
> I attached snapshot of tree as I want. Please tell me how can I do this.
>
> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri wrote:
>
>     Sorry, the code had a typo, it should be:
>
>     my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                                -tree => $t1,
>                                                -compact => 0);
>     for my $node ($t1->get_nodes) {
>         $node->id($node->branch_length) if defined $node->branch_length;
>     }
>     $obj1->print(-file => "$dir/$stem.eps")
>
>     On 03/08/2011 14:58, Roy Chaudhuri wrote:
>
>         Hi Shachi,
>
>         I don't think you can draw labels on branches using
>         Bio::Tree::Draw::Cladogram. However, it will draw node labels, so you
>         could copy the branch lengths over to the node ids:
>
>         my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                                    -tree => $t1,
>                                                    -compact => 0);
>         for my $node ($tree->get_nodes) {
>             $node->id($node->branch_length) if defined $node->branch_length;
>         }
>         $obj1->print(-file => "$dir/$stem.eps")
>
>         Incidentally, in your script you write the tree out to a file, then read
>         it back in using TreeIO. This is unnecessary, you can use $tree directly
>         as input to Bio::Tree::Draw::Cladogram.
>
>         Alternatively, you could write out a newick file and use non-Bioperl
>         software such as njplot or MEGA to draw your tree with labelled branch
>         lengths.
>
>         Cheers,
>         Roy.
>
>         On 03/08/2011 07:00, Shachi Gahoi wrote:
>
>             Dear All
>
>             I am using Bio::Tree modules for constructing and drawing tree. *I am unable
>             to show branch length value in tree.
>             *
>             Please tell me How can I do this, if anybody knows.
>
>             Here is my script which i am using...and i also attached generated tree.
>
>             Thanks in advance
>
>             ################################################################################################
>
>             use Bio::AlignIO;
>             use Bio::Align::ProteinStatistics;
>             use Bio::Tree::DistanceFactory;
>             use Bio::TreeIO;
>             use Bio::Tree::Draw::Cladogram;
>
>             # for a dna alignment
>             # can also use ProteinStatistics
>
>             my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format=>'clustalw');
>
>             my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');
>
>             my $stats = Bio::Align::ProteinStatistics->new;
>
>             my $treeout = Bio::TreeIO->new(-format => 'newick', -file =>'>ADP1.dnd');
>
>             while( my $aln = $alnio->next_aln )
>             {
>                 my $mat = $stats->distance(-method => 'Kimura', -align => $aln);
>
>                 my $tree = $dfactory->make_tree($mat);
>                 $treeout->write_tree($tree);
>             }
>
>             my $dir = shift || '.';
>
>             opendir(DIR, $dir) || die $!;
>             for my $file ( readdir(DIR) )
>             {
>                 next unless $file =~ /(\S+)\.dnd$/;
>                 my $stem = $1;
>                 my $treeio = Bio::TreeIO->new('-format' => 'newick',
>                                               '-file' => "$dir/$file");
>
>                 if( my $t1 = $treeio->next_tree )
>                 {
>                     my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
>                                                                -tree => $t1,
>                                                                -compact => 0);
>                     $obj1->print(-file => "$dir/$stem.eps");
>                 }
>             }
>
>             ########################################################################################################
>
>             _______________________________________________
>             Bioperl-l mailing list
>             Bioperl-l at lists.open-bio.org
>             http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Regards,
> Shachi

From razi.khaja at gmail.com Thu Aug 4 17:39:28 2011
From: razi.khaja at gmail.com (Razi Khaja)
Date: Thu, 4 Aug 2011 13:39:28 -0400
Subject: [Bioperl-l] BioPerl on GitHub will not install
Message-ID:

All,

I just checked out the latest development version of BioPerl from GitHub and
found that it does not install because bp_das_server.pl is missing. Building BioPerl 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm line 218. Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm line 218. Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. After copying the bp_das_server.pl that I had from a previous installation to 'blib/script', I was able to ./Build test and ./Build install the development version I checked out. Could someone test out this problem and fix it on github? if it really is a problem? Thanks, Razi From cjfields at illinois.edu Thu Aug 4 17:42:48 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 12:42:48 -0500 Subject: [Bioperl-l] [Bioperl-guts-l] BioPerl on GitHub will not install In-Reply-To: References: Message-ID: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> Yes, I can replicate that. It's from the recent renaming for scripts. I'll look into it. chris On Aug 4, 2011, at 12:39 PM, Razi Khaja wrote: > All, > > I just checked out the latest development version of BioPerl from GitHub and > found that it does not install because bp_das_server.pl is missing. > > Building BioPerl > 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are > identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 > Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm > line 218. > Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm > line 218. > Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': > No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. 
> > After copying the bp_das_server.pl that I had from a previous installation > to 'blib/script', I was able to ./Build test and ./Build install the > development version I checked out. > > Could someone test out this problem and fix it on github? if it really is a > problem? > > Thanks, > > Razi > _______________________________________________ > Bioperl-guts-l mailing list > Bioperl-guts-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l From hlapp at drycafe.net Thu Aug 4 21:31:52 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Thu, 4 Aug 2011 17:31:52 -0400 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: I agree. In fact I'm surprised that $io->_pushback() does not act like unshift() - that's I thought how it is used. -hilmar On Aug 3, 2011, at 10:16 AM, Chris Fields wrote: > On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: > >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Hi folks, >> >> as I mentioned on https://redmine.open-bio.org/issues/3264 there is >> something odd going on with Bio::Root::IO's _readline/_pushback >> functions. This seems to be intentional, at least there is a test >> case >> asserting the behaviour I'm seeing. It his however very confusing >> to the >> unexpecting programmer using the code. >> >> One assumption I'd immediately make would be that if I have code that >> does a $foo = $io->_readline; $io->_pushback($foo); $bar = >> $io->_readline;, $foo will be the same string as $bar, regardless >> what >> other pieces of the code did. Currently, this is not the case, >> because >> the readbuffer that _pushback pushes back into has new strings >> appended >> to the end but readline removes them from the front. > > I think this test is performed in the regressions already, but if > not then it is more than welcome. 
> >> This easily violates the "principle of least surprise", so I think we >> should change the readbuffer to a stack. As far as I can tell, >> changing >> the _pushback function to "unshift" instead of "push" to the >> readbuffer >> breaks only the Root/RootIO.t test designed to test the old >> behaviour. I >> don't see any other tests failing on my system that don't fail >> without >> this patch. >> >> Any comments from the core devs? > > I don't have a problem with that beyond the change to the RootIO.t > tests (it implies a specific behavior that some developers expect, > so is a very subtle API change). However, this is how one would > expect it, to be more like an 'unread' stack instead of a queue. In > fact, there is a module I used for Biome's pushback/readline called > IO::Unread that implements an IO layer for mimicing this behavior, > might be worth looking into. > >> Cheers, >> Kai > > chris > > > Christopher Fields > Senior Research Scientist > National Center for Supercomputing Applications > Institute for Genomic Biology > University of Illinois Urbana-Champaign > 1206 W. Gregory Dr. , MC-195 > Urbana, IL 61801 > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Thu Aug 4 21:42:30 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 16:42:30 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: References: <4E390CE8.2050100@biotech.uni-tuebingen.de> Message-ID: <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> Yeah, it's a queue; the 'buffering' is a simple internal array using push/shift. I say we merge the change in from the branch and fix any modules accordingly. 
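To make the queue-versus-stack difference in this thread concrete, here is a toy sketch. ToyReader, read_line and the mode option are hypothetical names and this is not Bio::Root::IO itself; it only assumes, as described above, that the current buffer uses push/shift and the proposed change uses unshift for pushback.

```perl
use strict;
use warnings;

package ToyReader;

# Toy stand-in for a reader with a pushback buffer (not Bio::Root::IO).
sub new {
    my ($class, %args) = @_;
    my $self = {
        source => [ @{ $args{lines} || [] } ],   # lines still "in the file"
        buffer => [],                            # lines that were pushed back
        mode   => $args{mode} || 'queue',        # 'queue' = push, 'stack' = unshift
    };
    return bless $self, $class;
}

# Reads always take from the front of the buffer first.
sub read_line {
    my ($self) = @_;
    return shift @{ $self->{buffer} } if @{ $self->{buffer} };
    return shift @{ $self->{source} };
}

sub pushback {
    my ($self, $line) = @_;
    if ($self->{mode} eq 'stack') {
        unshift @{ $self->{buffer} }, $line;     # proposed behaviour
    } else {
        push @{ $self->{buffer} }, $line;        # behaviour described as current
    }
    return;
}

package main;

# Other code has already pushed two lines back; then we read one line,
# push it back, and read again.
sub demo {
    my ($mode) = @_;
    my $io = ToyReader->new(lines => ['line3'], mode => $mode);
    $io->pushback('line1');
    $io->pushback('line2');
    my $foo = $io->read_line;
    $io->pushback($foo);
    my $bar = $io->read_line;
    return ($foo, $bar);
}

for my $mode (qw(queue stack)) {
    my ($foo, $bar) = demo($mode);
    printf "%-5s foo=%s bar=%s same=%s\n", $mode, $foo, $bar,
        ($foo eq $bar ? 'yes' : 'no');
}
```

In queue mode the line that was just pushed back ends up behind the other buffered line, so $bar differs from $foo; in stack mode the read/pushback/read round trip returns the same line, which is the behaviour Kai expected.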
chris On Aug 4, 2011, at 4:31 PM, Hilmar Lapp wrote: > I agree. In fact I'm surprised that $io->_pushback() does not act like unshift() - that's I thought how it is used. > > -hilmar > > On Aug 3, 2011, at 10:16 AM, Chris Fields wrote: > >> On Aug 3, 2011, at 3:55 AM, Kai Blin wrote: >> >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> Hi folks, >>> >>> as I mentioned on https://redmine.open-bio.org/issues/3264 there is >>> something odd going on with Bio::Root::IO's _readline/_pushback >>> functions. This seems to be intentional, at least there is a test case >>> asserting the behaviour I'm seeing. It his however very confusing to the >>> unexpecting programmer using the code. >>> >>> One assumption I'd immediately make would be that if I have code that >>> does a $foo = $io->_readline; $io->_pushback($foo); $bar = >>> $io->_readline;, $foo will be the same string as $bar, regardless what >>> other pieces of the code did. Currently, this is not the case, because >>> the readbuffer that _pushback pushes back into has new strings appended >>> to the end but readline removes them from the front. >> >> I think this test is performed in the regressions already, but if not then it is more than welcome. >> >>> This easily violates the "principle of least surprise", so I think we >>> should change the readbuffer to a stack. As far as I can tell, changing >>> the _pushback function to "unshift" instead of "push" to the readbuffer >>> breaks only the Root/RootIO.t test designed to test the old behaviour. I >>> don't see any other tests failing on my system that don't fail without >>> this patch. >>> >>> Any comments from the core devs? >> >> I don't have a problem with that beyond the change to the RootIO.t tests (it implies a specific behavior that some developers expect, so is a very subtle API change). However, this is how one would expect it, to be more like an 'unread' stack instead of a queue. 
In fact, there is a module I used for Biome's pushback/readline called IO::Unread that implements an IO layer for mimicing this behavior, might be worth looking into. >> >>> Cheers, >>> Kai >> >> chris >> >> >> Christopher Fields >> Senior Research Scientist >> National Center for Supercomputing Applications >> Institute for Genomic Biology >> University of Illinois Urbana-Champaign >> 1206 W. Gregory Dr. , MC-195 >> Urbana, IL 61801 >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 4 22:11:29 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 4 Aug 2011 17:11:29 -0500 Subject: [Bioperl-l] [Bioperl-guts-l] BioPerl on GitHub will not install In-Reply-To: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> References: <007DAD37-BC86-4D1F-8C40-816890661F7D@illinois.edu> Message-ID: <0A691C42-539E-45A1-B44F-7B0B5D8DE3D8@illinois.edu> Now fixed on github. There was some cruft left in Bio::Root::Build that didn't deal with the recent script renaming. chris On Aug 4, 2011, at 12:42 PM, Chris Fields wrote: > Yes, I can replicate that. It's from the recent renaming for scripts. I'll look into it. > > chris > > On Aug 4, 2011, at 12:39 PM, Razi Khaja wrote: > >> All, >> >> I just checked out the latest development version of BioPerl from GitHub and >> found that it does not install because bp_das_server.pl is missing. 
>> >> Building BioPerl >> 'blib/script/bp_das_server.pl' and 'blib/script/bp_das_server.pl' are >> identical (not copied) at /opt/bioperl-live/Bio/Root/Build.pm line 219 >> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm >> line 218. >> Use of uninitialized value in utime at /usr/lib/perl5/5.8.8/File/Copy.pm >> line 218. >> Can't rename 'blib/script/bp_das_server.pl' to 'blib/script/bp_das_server.pl': >> No such file or directory at /opt/bioperl-live/Bio/Root/Build.pm line 219. >> >> After copying the bp_das_server.pl that I had from a previous installation >> to 'blib/script', I was able to ./Build test and ./Build install the >> development version I checked out. >> >> Could someone test out this problem and fix it on github? if it really is a >> problem? >> >> Thanks, >> >> Razi >> _______________________________________________ >> Bioperl-guts-l mailing list >> Bioperl-guts-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From shachigahoimbi at gmail.com Fri Aug 5 05:40:11 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Fri, 5 Aug 2011 11:10:11 +0530 Subject: [Bioperl-l] How to show branch length value in tree In-Reply-To: <4E3A80EF.2010409@gmail.com> References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> Message-ID: Instead of both node id and accession, Can I replace node id with accession? On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote: > Hi Shachi, > > Please keep replies on the mailing list, that way others can follow the > discussion. > > As I mentioned, it is not possible to draw njplot-style trees with labelled > branches using Bio::Tree::Draw::Cladogram, it currently only labels nodes > (you could perhaps add branch labels as a feature request on Redmine). 
> > The code I gave overwrites the existing "leaf" node ids (the accessions) > with branch lengths, if you want to also keep the existing labels you could > try something like: > > > for my $node ($t1->get_nodes) { > if ($node->is_Leaf) { > $node->id($node->branch_**length.' '.$node->id); > } else { > > $node->id($node->branch_**length) > } > } > > Cheers, > Roy. > > > On 04/08/2011 05:36, Shachi Gahoi wrote: > >> Thank You so much. Now branch length is coming in tree. >> >> But I want Accesssion number in place of node id. >> >> I attached snapshot of tree as I want. Please tell me how can I do this. >> >> >> >> >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri > >> wrote: >> >> Sorry, the code had a typo, it should be: >> >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($t1->get_nodes) { >> >> $node->id($node->branch___**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> On 03/08/2011 14:58, Roy Chaudhuri wrote: >> >> Hi Shachi, >> >> I don't think you can draw labels on branches using >> Bio::Tree::Draw::Cladogram. However, it will draw node labels, >> so you >> could copy the branch lengths over to the node ids: >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($tree->get_nodes) { >> $node->id($node->branch___**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> Incidentally, in your script you write the tree out to a file, >> then read >> it back in using TreeIO. This is unnecessary, you can use $tree >> directly >> as input to Bio::Tree::Draw::Cladogram. >> >> Alternatively, you could write out a newick file and use >> non-Bioperl >> software such as njplot or MEGA to draw your tree with labelled >> branch >> lengths. >> >> Cheers, >> Roy. 
>> >> On 03/08/2011 07:00, Shachi Gahoi wrote: >> >> Dear All >> >> I am using Bio::Tree modules for constructing and drawing >> tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also attached >> generated tree. >> >> Thanks in advance >> >> ##############################**__############################ >> **##__##########################**####__###### >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', >> -format=>'clustalw'); >> >> my $dfactory = Bio::Tree::DistanceFactory->__**new(-method => >> 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics-**__>new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', -file >> =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', -align >> => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = >> Bio::Tree::Draw::Cladogram->__**new(-bootstrap => 1, >> -tree >> => $t1, >> >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> ##############################**__############################ >> **##__##########################**####__############## >> >> >> >> >> ______________________________**___________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> >> > >> >> http://lists.open-bio.org/__**mailman/listinfo/bioperl-l >> >> > >> >> >> >> >> 
>> >> -- >> Regards, >> Shachi >> > > -- Regards, Shachi From kai.blin at biotech.uni-tuebingen.de Fri Aug 5 08:40:57 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Fri, 05 Aug 2011 10:40:57 +0200 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> Message-ID: <4E3BAC99.8050806@biotech.uni-tuebingen.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2011-08-04 23:42, Chris Fields wrote: > Yeah, it's a queue; the 'buffering' is a simple internal array using > push/shift. I say we merge the change in from the branch and fix > any modules accordingly. Ok, I'm happy to take care of it, if people can tell me how to find and fix modules that use the old assumption. My initial attempt right after making the change was to run the test suite, which came up clean apart from the RootIO.t case that my patch now modifies as well. Cheers, Kai - -- Dipl.-Inform. 
Kai Blin         kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-Universität Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax   : ++49 7071 29-5979
Germany
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOO6yZAAoJEKM5lwBiwTTPdjsH/0ELbz9VYIzxlpx+QZ3Jvd55
KTXVP+oOzjIDlOdxbdqYR0w04VXnpkQek3hVt0mbreuKvtdMJY/YhRwZLiOzYSak
ruhswUJQnm3K2vkaqpgLESIIUASneFrW7ezfV3R9q/Ov730GBDAtkLTEk7cVV5Cg
W515ixJtNC7v6fZmNFJZudQbcUYYgy+8BFgvNUaSoH8YqubMXzjFXknBWeWT0qco
ivHjqIc6Nkap799ijPiLEU7ArI1pEOB2jyvjntIocFR72imbo7e86RaVHJCNl/N7
GFbRGoH2m7LVeWFYuNM3vsTS3W4KVLg9U/8UBysykR3uoHAVJhm4T5nCT4NKE/w=
=z6QZ
-----END PGP SIGNATURE-----

From roy.chaudhuri at gmail.com Fri Aug 5 10:54:32 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 05 Aug 2011 11:54:32 +0100
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To:
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com>
Message-ID: <4E3BCBE8.4030303@gmail.com>

In that case then you only want to add branch lengths to non-leaf nodes, so it would be:

for my $node ($t1->get_nodes) {
    $node->id($node->branch_length) unless $node->is_Leaf
}

On 05/08/2011 06:40, Shachi Gahoi wrote:
>
> Instead of both node id and accession, Can I replace node id with accession?
>
>
> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote:
>
>     Hi Shachi,
>
>     Please keep replies on the mailing list, that way others can follow
>     the discussion.
>
>     As I mentioned, it is not possible to draw njplot-style trees with
>     labelled branches using Bio::Tree::Draw::Cladogram, it currently
>     only labels nodes (you could perhaps add branch labels as a feature
>     request on Redmine).
> > The code I gave overwrites the existing "leaf" node ids (the > accessions) with branch lengths, if you want to also keep the > existing labels you could try something like: > > > for my $node ($t1->get_nodes) { > if ($node->is_Leaf) { > $node->id($node->branch___length.' '.$node->id); > } else { > > $node->id($node->branch___length) > } > } > > Cheers, > Roy. > > > On 04/08/2011 05:36, Shachi Gahoi wrote: > > Thank You so much. Now branch length is coming in tree. > > But I want Accesssion number in place of node id. > > I attached snapshot of tree as I want. Please tell me how can I > do this. > > > > > On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri > > >> wrote: > > Sorry, the code had a typo, it should be: > > > my $obj1 = Bio::Tree::Draw::Cladogram->____new(-bootstrap => 1, > -tree => $t1, > -compact => 0); > for my $node ($t1->get_nodes) { > > $node->id($node->branch_____length) if defined > $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > On 03/08/2011 14:58, Roy Chaudhuri wrote: > > Hi Shachi, > > I don't think you can draw labels on branches using > Bio::Tree::Draw::Cladogram. However, it will draw node > labels, > so you > could copy the branch lengths over to the node ids: > > my $obj1 = > Bio::Tree::Draw::Cladogram->____new(-bootstrap => 1, > -tree => > $t1, > -compact => > 0); > for my $node ($tree->get_nodes) { > $node->id($node->branch_____length) if defined > $node->branch_length; > } > $obj1->print(-file => "$dir/$stem.eps") > > Incidentally, in your script you write the tree out to a > file, > then read > it back in using TreeIO. This is unnecessary, you can > use $tree > directly > as input to Bio::Tree::Draw::Cladogram. > > Alternatively, you could write out a newick file and use > non-Bioperl > software such as njplot or MEGA to draw your tree with > labelled > branch > lengths. > > Cheers, > Roy. 
> > On 03/08/2011 07:00, Shachi Gahoi wrote: > > Dear All > > I am using Bio::Tree modules for constructing and > drawing > tree. *I am unable > to show branch length value in tree. > * > Please tell me How can I do this, if anybody knows. > > Here is my script which i am using...and i also attached > generated tree. > > Thanks in advance > > > ##############################____############################__##__##########################__####__###### > > use Bio::AlignIO; > use Bio::Align::ProteinStatistics; > use Bio::Tree::DistanceFactory; > use Bio::TreeIO; > use Bio::Tree::Draw::Cladogram; > > # for a dna alignment > # can also use ProteinStatistics > > my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', > -format=>'clustalw'); > > my $dfactory = > Bio::Tree::DistanceFactory->____new(-method => > 'UPGMA'); > > my $stats = Bio::Align::ProteinStatistics-____>new; > > my $treeout = Bio::TreeIO->new(-format => 'newick', > -file > =>'>ADP1.dnd'); > > while( my $aln = $alnio->next_aln ) > { > my $mat = $stats->distance(-method => 'Kimura', > -align > => $aln); > > my $tree = $dfactory->make_tree($mat); > $treeout->write_tree($tree); > } > > my $dir = shift || '.'; > > opendir(DIR, $dir) || die $!; > for my $file ( readdir(DIR) ) > { > next unless $file =~ /(\S+)\.dnd$/; > my $stem = $1; > my $treeio = Bio::TreeIO->new('-format' => > 'newick', > '-file' => "$dir/$file"); > > if( my $t1 = $treeio->next_tree ) > { > my $obj1 = > Bio::Tree::Draw::Cladogram->____new(-bootstrap => 1, > > -tree > => $t1, > > -compact => 0); > $obj1->print(-file => "$dir/$stem.eps"); > } > } > > > ##############################____############################__##__##########################__####__############## > > > > > ___________________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > > > > http://lists.open-bio.org/____mailman/listinfo/bioperl-l > > > > > > > > > > -- > Regards, > Shachi > > > > > > -- > Regards, > Shachi From lskatz at 
gmail.com Fri Aug 5 14:32:50 2011 From: lskatz at gmail.com (Lee Katz) Date: Fri, 5 Aug 2011 10:32:50 -0400 Subject: [Bioperl-l] SeqIO: paired end reads In-Reply-To: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com> Message-ID: Thank you. I figured out through the Newbler manual that there is a linker sequence to separate the paired end reads. Then, the forum at http://seqanswers.com/forums/showthread.php?t=12940 showed me that the linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC". I think a useful addition to bioperl could be to have paired end reads. This is outside of the domain of bioperl, but now I am left wondering how I could specify the distance between reads in Newbler, if the linker sequence is fixed. On Wed, Aug 3, 2011 at 5:17 PM, Jason Stajich wrote: > it depends on the assembler - For Illumina usually the paired ends end with > /1 /2 and they have the same ID but are in two different files. Depends on > if you are using interleaved paired reads or in two separate files. some > just expect the paired reads to be mated by virtue of being in same order in > two files. the ABYSS and Velvet manuals both explain what is expected so > you will want to check on what are Newbler's assumptions on how the paired > ends are encoded. > > There are simulator tools if that is what you are trying to do in the end? > checkout wgsim which comes with samtools or try dnaa > > > On Aug 3, 2011, at 1:01 PM, Lee Katz wrote: > > > Hi all! I was wondering how to construct paired end reads from scratch. > I > > know the locations of certain sequences across the genome with a high > degree > > of confidence and so I want to give them to my assembler as paired end > > reads, along with my other sequence runs (454 and Illumina runs). I plan > to > > use Newbler. 
> > > > My only problem is that I do not know the correct format in order to > specify > > distance and sequences for a paired end reads run, and so I hope that > there > > is a SeqIO solution. At the least, I hope that one bioperl member can > point > > me to where the definition of the paired end reads file format is...? > > > > Thank you! > > > > --Lee > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Fri Aug 5 15:50:42 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 5 Aug 2011 10:50:42 -0500 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <4E3BAC99.8050806@biotech.uni-tuebingen.de> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> <4E3BAC99.8050806@biotech.uni-tuebingen.de> Message-ID: <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> I would just go based on the test suite for now. If we run into others that don't have tests we need to add new tests for those anyway. chris On Aug 5, 2011, at 3:40 AM, Kai Blin wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 2011-08-04 23:42, Chris Fields wrote: > >> Yeah, it's a queue; the 'buffering' is a simple internal array using >> push/shift. I say we merge the change in from the branch and fix >> any modules accordingly. > > Ok, I'm happy to take care of it, if people can tell me how to find and > fix modules that use the old assumption. My initial attempt right after > making the change was to run the test suite, which came up clean apart > from the RootIO.t case that my patch now modifies as well. > > Cheers, > Kai > > - -- > Dipl.-Inform. 
Kai Blin kai.blin at biotech.uni-tuebingen.de > Institute for Microbiology and Infection Medicine > Division of Microbiology/Biotechnology > Eberhard-Karls-Universit?t T?bingen > Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 > D-72076 T?bingen Fax : ++49 7071 29-5979 > Germany > Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iQEcBAEBAgAGBQJOO6yZAAoJEKM5lwBiwTTPdjsH/0ELbz9VYIzxlpx+QZ3Jvd55 > KTXVP+oOzjIDlOdxbdqYR0w04VXnpkQek3hVt0mbreuKvtdMJY/YhRwZLiOzYSak > ruhswUJQnm3K2vkaqpgLESIIUASneFrW7ezfV3R9q/Ov730GBDAtkLTEk7cVV5Cg > W515ixJtNC7v6fZmNFJZudQbcUYYgy+8BFgvNUaSoH8YqubMXzjFXknBWeWT0qco > ivHjqIc6Nkap799ijPiLEU7ArI1pEOB2jyvjntIocFR72imbo7e86RaVHJCNl/N7 > GFbRGoH2m7LVeWFYuNM3vsTS3W4KVLg9U/8UBysykR3uoHAVJhm4T5nCT4NKE/w= > =z6QZ > -----END PGP SIGNATURE----- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Aug 5 20:49:54 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 5 Aug 2011 15:49:54 -0500 Subject: [Bioperl-l] BioPerl Test requirements In-Reply-To: <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> References: <79E5ED11-1D3F-436F-AA02-D922EAEB123A@illinois.edu> <0D28A228-53D1-4843-B99D-9F8A48132EA2@illinois.edu> Message-ID: <1FDBD8D4-E8E6-44EB-A18A-7E74A0EF9014@illinois.edu> Okay, I tested this out on a branch and then merged into 'master'. Test::Most is a 'build_requires'; Bio::Root::Test is now just a wrapper for Test::Most methods, with a few extra wrinkles to deal with Test::Warn and a few additional methods. I also removed extraneous modules in t/lib along with Bio::Root::Test::Warn (that code was merged into Bio::Root::Test to keep all evilness in one contained location). The nice thing is the transition didn't require changing any tests. 
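For readers unfamiliar with 'build_requires', the entry might look roughly like the fragment below. This is a generic Module::Build sketch, assuming a placeholder distribution name and version; BioPerl's real Build.PL goes through Bio::Root::Build and is not reproduced here.

```perl
use strict;
use warnings;
use Module::Build;

# Placeholder distribution details, for illustration only.
my $build = Module::Build->new(
    dist_name      => 'Hypothetical-Dist',
    dist_version   => '0.01',
    license        => 'perl',

    # Needed at run time by the installed modules.
    requires       => { 'perl' => '5.6.1' },

    # Needed only to build and test the distribution.
    build_requires => { 'Test::Most' => 0 },
);

$build->create_build_script;
```

Listing Test::Most under build_requires rather than requires keeps it out of the run-time dependency list, which is the point made above about the test modules only being needed during the build phase.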
However, this will require some testing across the board to make sure everything's working. Maybe worth getting the code cleaned up for another quick point release prior to the GSoC mayhem to ensue shortly... :) chris On Aug 1, 2011, at 3:34 PM, Chris Fields wrote: > Okay, will do. I'll initially test on a branch and then pull in. Thanks for the feedback Hilmar and Dave! > > chris > > On Aug 1, 2011, at 3:30 PM, Hilmar Lapp wrote: > >> I think the small burden this change incurs for each developer is well outweighed by the reduced maintenance and installation burden. Go for it. >> >> -hilmar >> >> On Aug 1, 2011, at 12:07 AM, Chris Fields wrote: >> >>> All, >>> >>> We are currently using a BioPerl-specific module for running tests called Bio::Root::Test. It is essentially a wrapper module, re-exporting all the methods for Test::More, Test::Exception, and Test::Warn. One problem: it currently expects a copy of Test::Warn and Test::Exception in each repository as a fallback. Another problem: these included modules appear to be triggering dependencies with debian packaging. >>> >>> As an example of one hidden dependency, the included Test::Warn requires Array::Compare, which converted to Moose a few years ago, so you automatically have to install the entire Moose dependency tree, even though Bioperl doesn't require it (not a slam on Moose, you really SHOULD be using Moose these days. No, really :). >>> >>> Anway, more recent versions of Test::Warn don't have this requirement, but as we package an old version of this module we get stuck with the dependencies until we (manually) update this for each repository. Ick. >>> >>> I think the best solution is to remove the bioperl-local modules in t/lib and list Test::Most instead as a 'build_requires' in Build.PL, e.g. the module is only necessary for the build phase so is optionally installed. 
Test::Most essentially does exactly the same thing as Bio::Root::Test and more; it also includes Test::Deep and Test::Diff (Bio::Root::Test has a few additional methods of use as well). >>> >>> As this will require developers to use Test::Most instead, though, I though it would be worth asking on the list to see if there are any objections. Any thoughts? >>> >>> >>> chris >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : >> =========================================================== >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From kai.blin at biotech.uni-tuebingen.de Fri Aug 5 22:35:32 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Sat, 06 Aug 2011 00:35:32 +0200 Subject: [Bioperl-l] Bio::Root::IO _readline/_pushback behavior In-Reply-To: <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> References: <4E390CE8.2050100@biotech.uni-tuebingen.de> <4196E008-4A81-41E5-A4F9-F9F8D3851E5C@illinois.edu> <4E3BAC99.8050806@biotech.uni-tuebingen.de> <86DE321E-E532-4089-9B89-E257DB37CE46@illinois.edu> Message-ID: <4E3C7034.2000106@biotech.uni-tuebingen.de> On 2011-08-05 17:50, Chris Fields wrote: > I would just go based on the test suite for now. If we run into > others that don't have tests we need to add new tests for those > anyway. Ok, pushed to master. Cheers, Kai -- Dipl.-Inform. 
Kai Blin         kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-University of Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax   : ++49 7071 29-5979
Deutschland
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben

From shachigahoimbi at gmail.com Sat Aug 6 04:25:43 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Sat, 6 Aug 2011 09:55:43 +0530
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To: <4E3BCBE8.4030303@gmail.com>
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> <4E3BCBE8.4030303@gmail.com>
Message-ID:

Thank you so much.

Please tell me one more thing, *can I reduce branch length font?*

On Fri, Aug 5, 2011 at 4:24 PM, Roy Chaudhuri wrote:

> In that case then you only want to add branch lengths to non-leaf nodes, so
> it would be:
>
>
> for my $node ($t1->get_nodes) {
>     $node->id($node->branch_length) unless $node->is_Leaf
> }
>
>
> On 05/08/2011 06:40, Shachi Gahoi wrote:
>
>>
>> Instead of both node id and accession, Can I replace node id with
>> accession?
>>
>>
>> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote:
>>
>>     Hi Shachi,
>>
>>     Please keep replies on the mailing list, that way others can follow
>>     the discussion.
>>
>>     As I mentioned, it is not possible to draw njplot-style trees with
>>     labelled branches using Bio::Tree::Draw::Cladogram, it currently
>>     only labels nodes (you could perhaps add branch labels as a feature
>>     request on Redmine).
>>
>>     The code I gave overwrites the existing "leaf" node ids (the
>>     accessions) with branch lengths, if you want to also keep the
>>     existing labels you could try something like:
>>
>>
>>     for my $node ($t1->get_nodes) {
>>         if ($node->is_Leaf) {
>>             $node->id($node->branch_length.' '.$node->id);
>>         } else {
>>             $node->id($node->branch_length)
>>         }
>>     }
>>
>>     Cheers,
>>     Roy.
>> >> >> On 04/08/2011 05:36, Shachi Gahoi wrote: >> >> Thank You so much. Now branch length is coming in tree. >> >> But I want Accesssion number in place of node id. >> >> I attached snapshot of tree as I want. Please tell me how can I >> do this. >> >> >> >> >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri >> >> > >> > >>> >> wrote: >> >> Sorry, the code had a typo, it should be: >> >> >> my $obj1 = Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => >> 1, >> -tree => $t1, >> -compact => 0); >> for my $node ($t1->get_nodes) { >> >> $node->id($node->branch_____**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> On 03/08/2011 14:58, Roy Chaudhuri wrote: >> >> Hi Shachi, >> >> I don't think you can draw labels on branches using >> Bio::Tree::Draw::Cladogram. However, it will draw node >> labels, >> so you >> could copy the branch lengths over to the node ids: >> >> my $obj1 = >> Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => 1, >> -tree => >> $t1, >> -compact => >> 0); >> for my $node ($tree->get_nodes) { >> $node->id($node->branch_____**length) if defined >> $node->branch_length; >> } >> $obj1->print(-file => "$dir/$stem.eps") >> >> Incidentally, in your script you write the tree out to a >> file, >> then read >> it back in using TreeIO. This is unnecessary, you can >> use $tree >> directly >> as input to Bio::Tree::Draw::Cladogram. >> >> Alternatively, you could write out a newick file and use >> non-Bioperl >> software such as njplot or MEGA to draw your tree with >> labelled >> branch >> lengths. >> >> Cheers, >> Roy. >> >> On 03/08/2011 07:00, Shachi Gahoi wrote: >> >> Dear All >> >> I am using Bio::Tree modules for constructing and >> drawing >> tree. *I am unable >> to show branch length value in tree. >> * >> Please tell me How can I do this, if anybody knows. >> >> Here is my script which i am using...and i also >> attached >> generated tree. 
>> >> Thanks in advance >> >> >> ##############################**____##########################** >> ##__##__######################**####__####__###### >> >> use Bio::AlignIO; >> use Bio::Align::ProteinStatistics; >> use Bio::Tree::DistanceFactory; >> use Bio::TreeIO; >> use Bio::Tree::Draw::Cladogram; >> >> # for a dna alignment >> # can also use ProteinStatistics >> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', >> -format=>'clustalw'); >> >> my $dfactory = >> Bio::Tree::DistanceFactory->__**__new(-method => >> 'UPGMA'); >> >> my $stats = Bio::Align::ProteinStatistics-**____>new; >> >> my $treeout = Bio::TreeIO->new(-format => 'newick', >> -file >> =>'>ADP1.dnd'); >> >> while( my $aln = $alnio->next_aln ) >> { >> my $mat = $stats->distance(-method => 'Kimura', >> -align >> => $aln); >> >> my $tree = $dfactory->make_tree($mat); >> $treeout->write_tree($tree); >> } >> >> my $dir = shift || '.'; >> >> opendir(DIR, $dir) || die $!; >> for my $file ( readdir(DIR) ) >> { >> next unless $file =~ /(\S+)\.dnd$/; >> my $stem = $1; >> my $treeio = Bio::TreeIO->new('-format' => >> 'newick', >> '-file' => "$dir/$file"); >> >> if( my $t1 = $treeio->next_tree ) >> { >> my $obj1 = >> Bio::Tree::Draw::Cladogram->__**__new(-bootstrap => >> 1, >> >> -tree >> => $t1, >> >> -compact => 0); >> $obj1->print(-file => "$dir/$stem.eps"); >> } >> } >> >> >> ##############################**____##########################** >> ##__##__######################**####__####__############## >> >> >> >> >> ______________________________**_____________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org > bio.org > >> >> >> >> >> >> http://lists.open-bio.org/____**mailman/listinfo/bioperl-l >> >> > >> >> >> >> >> >> >> >> >> >> >> -- >> Regards, >> Shachi >> >> >> >> >> >> -- >> Regards, >> Shachi >> > > -- Regards, Shachi From p.j.a.cock at googlemail.com Sun Aug 7 09:40:52 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 7 Aug 2011 10:40:52 +0100 Subject: 
[Bioperl-l] SeqIO: paired end reads
In-Reply-To:
References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com>
Message-ID:

On Friday, August 5, 2011, Lee Katz wrote:
> Thank you. I figured out through the Newbler manual that there is a linker
> sequence to separate the paired end reads. Then, the forum at
> http://seqanswers.com/forums/showthread.php?t=12940 showed me that the
> linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC".

There is more than one Roche 454 linker sequence depending on the chemistry
used; one is the same as its reverse complement, one isn't.

There is nothing in the SFF file format (nor the Roche specific XML manifest
last time I checked) that handles the paired end information explicitly.

> I think a useful addition to bioperl could be to have paired end reads.
>

Maybe, but to do this well you'd want to do flow space alignment of the
reads to the linker sequence to find the imperfectly called linker
sequences.

Personally I use sff_extract, which is a free open source command line tool
for this (calling an external alignment tool for paired end 454).

> This is outside of the domain of bioperl, but now I am left wondering how I
> could specify the distance between reads in Newbler, if the linker sequence
> is fixed.

How to do that depends on the alignment or assembly tool you are using.

Peter

From cjfields at illinois.edu Sun Aug 7 15:51:19 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 7 Aug 2011 10:51:19 -0500
Subject: [Bioperl-l] SeqIO: paired end reads
In-Reply-To:
References: <57EA9809-E999-43EF-B340-9A552A4A3FB6@gmail.com>
Message-ID: <19923C8B-6C84-4D9B-8D37-86CAE9BC681E@illinois.edu>

On Aug 7, 2011, at 4:40 AM, Peter Cock wrote:

> On Friday, August 5, 2011, Lee Katz wrote:
>> Thank you. I figured out through the Newbler manual that there is a linker
>> sequence to separate the paired end reads.
Then, the forum at
>> http://seqanswers.com/forums/showthread.php?t=12940 showed me that the
>> linker sequence is "GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC".
>
> There is more than one Roche 454 linker sequence depending on the chemistry
> used; one is the same as its reverse complement, one isn't.
>
> There is nothing in the SFF file format (nor the Roche specific XML manifest
> last time I checked) that handles the paired end information explicitly.

Yep, it's all implied AFAIK.

>> I think a useful addition to bioperl could be to have paired end reads.
>>
>
> Maybe, but to do this well you'd want to do flow space alignment of the
> reads to the linker sequence to find the imperfectly called linker
> sequences.
>
> Personally I use sff_extract, which is a free open source command line tool
> for this (calling an external alignment tool for paired end 454).

I think it could be done, but I would implement something like this as a wrapper around faster tools (like sff_extract or similar). Implementing the functionality in pure (bio)perl/(bio)python doesn't make much sense if there are newer/faster tools out there.

>> This is outside of the domain of bioperl, but now I am left wondering how I
>> could specify the distance between reads in Newbler, if the linker sequence
>> is fixed.
>
> How to do that depends on the alignment or assembly tool you are using.
>
> Peter

Yep. I don't think there is a defined way to specify that in any format that I know of.
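To show what "a linker sequence separates the paired ends" means in practice, here is a deliberately naive sketch that splits a read at an exact copy of the linker quoted in this thread. It is an assumption-laden illustration: split_at_linker is a hypothetical helper, the example read is made up, and exact matching will miss the imperfectly called linkers mentioned above, which is why a dedicated tool such as sff_extract is the better choice for real data.

```perl
use strict;
use warnings;

# Linker sequence quoted earlier in the thread.
my $LINKER = 'GTTGGAACCGAAAGGGTTTGAATTCAAACCCTTTCGGTTCCAAC';

# Split a read at the first exact copy of the linker.
# Returns (left, right), or an empty list if no exact linker is found.
# Hypothetical helper: exact matching only, no tolerance for miscalls.
sub split_at_linker {
    my ($read, $linker) = @_;
    my $pos = index(uc($read), uc($linker));
    return if $pos < 0;
    my $left  = substr($read, 0, $pos);
    my $right = substr($read, $pos + length($linker));
    return ($left, $right);
}

# Made-up read: two short flanks joined by the linker.
my $read = 'ACGTACGTAC' . $LINKER . 'TTGACCATGA';

if (my ($left, $right) = split_at_linker($read, $LINKER)) {
    print "left:  $left\n";
    print "right: $right\n";
}
else {
    print "no exact linker found\n";
}
```

The sketch says nothing about the distance between the two halves; as noted above, how that is specified depends on the assembler.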
chris

From Russell.Smithies at agresearch.co.nz Sun Aug 7 21:45:19 2011
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 8 Aug 2011 09:45:19 +1200
Subject: [Bioperl-l] How to show branch length value in tree
In-Reply-To:
References: <4E3953FE.5080304@gmail.com> <4E3954AE.2080401@gmail.com> <4E3A80EF.2010409@gmail.com> <4E3BCBE8.4030303@gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D3C9@exchsth.agresearch.co.nz>

The constructor for Bio::Tree::Draw::Cladogram lets you specify the font and size, did you try setting it there?

 Title   : new
 Usage   : my $obj = Bio::Tree::Draw::Cladogram->new();
 Function: Builds a new Bio::Tree::Draw::Cladogram object
 Returns : Bio::Tree::Draw::Cladogram
 Args    : -tree => Bio::Tree::Tree object
           -second => Bio::Tree::Tree object (optional)
           -font => font name [string] (optional)        <<<<-------------
           -size => font size [integer] (optional)       <<<<-------------
           -top => top margin [integer] (optional)
           -bottom => bottom margin [integer] (optional)
           -left => left margin [integer] (optional)
           -right => right margin [integer] (optional)
           -tip => extra tip space [integer] (optional)
           -column => extra space between cladograms [integer] (optional)
           -compact => ignore branch lengths [boolean] (optional)
           -ratio => horizontal to vertical ratio [integer] (optional)
           -colors => use colors to color edges [boolean] (optional)
           -bootstrap => draw bootstrap or internal ids [boolean]

--Russell

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Shachi Gahoi
> Sent: Saturday, 6 August 2011 4:26 p.m.
> To: Roy Chaudhuri
> Cc: bioperl-l List
> Subject: Re: [Bioperl-l] How to show branch length value in tree
>
> Thank you so much.
>
> Please tell me one more thing, *can I reduce branch length font?
> *
> On Fri, Aug 5, 2011 at 4:24 PM, Roy Chaudhuri wrote:
>
> > In that case then you only want to add branch lengths to non-leaf nodes, so
> > it would be:
> >
> >
> > for my $node ($t1->get_nodes) {
> >     $node->id($node->branch_length) unless $node->is_Leaf;
> > }
> >
> >
> > On 05/08/2011 06:40, Shachi Gahoi wrote:
> >
> >>
> >> Instead of both node id and accession, can I replace node id with
> >> accession?
> >>
> >>
> >> On Thu, Aug 4, 2011 at 4:52 PM, Roy Chaudhuri wrote:
> >>
> >> Hi Shachi,
> >>
> >> Please keep replies on the mailing list, that way others can follow
> >> the discussion.
> >>
> >> As I mentioned, it is not possible to draw njplot-style trees with
> >> labelled branches using Bio::Tree::Draw::Cladogram, it currently
> >> only labels nodes (you could perhaps add branch labels as a feature
> >> request on Redmine).
> >>
> >> The code I gave overwrites the existing "leaf" node ids (the
> >> accessions) with branch lengths, if you want to also keep the
> >> existing labels you could try something like:
> >>
> >>
> >> for my $node ($t1->get_nodes) {
> >>     if ($node->is_Leaf) {
> >>         $node->id($node->branch_length.' '.$node->id);
> >>     } else {
> >>         $node->id($node->branch_length);
> >>     }
> >> }
> >>
> >> Cheers,
> >> Roy.
> >>
> >>
> >> On 04/08/2011 05:36, Shachi Gahoi wrote:
> >>
> >> Thank You so much. Now branch length is coming in tree.
> >>
> >> But I want Accession number in place of node id.
> >>
> >> I attached snapshot of tree as I want. Please tell me how can I
> >> do this.
> >>
> >>
> >>
> >>
> >> On Wed, Aug 3, 2011 at 7:31 PM, Roy Chaudhuri wrote:
> >>
> >> Sorry, the code had a typo, it should be:
> >>
> >>
> >> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
> >>                                            -tree      => $t1,
> >>                                            -compact   => 0);
> >> for my $node ($t1->get_nodes) {
> >>     $node->id($node->branch_length) if defined $node->branch_length;
> >> }
> >> $obj1->print(-file => "$dir/$stem.eps")
> >>
> >> On 03/08/2011 14:58, Roy Chaudhuri wrote:
> >>
> >> Hi Shachi,
> >>
> >> I don't think you can draw labels on branches using
> >> Bio::Tree::Draw::Cladogram. However, it will draw node labels, so you
> >> could copy the branch lengths over to the node ids:
> >>
> >> my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
> >>                                            -tree      => $t1,
> >>                                            -compact   => 0);
> >> for my $node ($tree->get_nodes) {
> >>     $node->id($node->branch_length) if defined $node->branch_length;
> >> }
> >> $obj1->print(-file => "$dir/$stem.eps")
> >>
> >> Incidentally, in your script you write the tree out to a file, then read
> >> it back in using TreeIO. This is unnecessary, you can use $tree directly
> >> as input to Bio::Tree::Draw::Cladogram.
> >>
> >> Alternatively, you could write out a newick file and use non-Bioperl
> >> software such as njplot or MEGA to draw your tree with labelled branch
> >> lengths.
> >>
> >> Cheers,
> >> Roy.
> >>
> >> On 03/08/2011 07:00, Shachi Gahoi wrote:
> >>
> >> Dear All
> >>
> >> I am using Bio::Tree modules for constructing and drawing
> >> tree. *I am unable
> >> to show branch length value in tree.
> >> *
> >> Please tell me How can I do this, if anybody knows.
> >>
> >> Here is my script which i am using...and i also attached
> >> generated tree.
> >>
> >> Thanks in advance
> >>
> >>
> >> ##########################################################
> >>
> >> use Bio::AlignIO;
> >> use Bio::Align::ProteinStatistics;
> >> use Bio::Tree::DistanceFactory;
> >> use Bio::TreeIO;
> >> use Bio::Tree::Draw::Cladogram;
> >>
> >> # for a dna alignment
> >> # can also use ProteinStatistics
> >>
> >> my $alnio = Bio::AlignIO->new(-file => 'ADP.aln', -format => 'clustalw');
> >>
> >> my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'UPGMA');
> >>
> >> my $stats = Bio::Align::ProteinStatistics->new;
> >>
> >> my $treeout = Bio::TreeIO->new(-format => 'newick', -file => '>ADP1.dnd');
> >>
> >> while( my $aln = $alnio->next_aln )
> >> {
> >>     my $mat = $stats->distance(-method => 'Kimura', -align => $aln);
> >>
> >>     my $tree = $dfactory->make_tree($mat);
> >>     $treeout->write_tree($tree);
> >> }
> >>
> >> my $dir = shift || '.';
> >>
> >> opendir(DIR, $dir) || die $!;
> >> for my $file ( readdir(DIR) )
> >> {
> >>     next unless $file =~ /(\S+)\.dnd$/;
> >>     my $stem = $1;
> >>     my $treeio = Bio::TreeIO->new('-format' => 'newick',
> >>                                   '-file'   => "$dir/$file");
> >>
> >>     if( my $t1 = $treeio->next_tree )
> >>     {
> >>         my $obj1 = Bio::Tree::Draw::Cladogram->new(-bootstrap => 1,
> >>                                                    -tree      => $t1,
> >>                                                    -compact   => 0);
> >>         $obj1->print(-file => "$dir/$stem.eps");
> >>     }
> >> }
> >>
> >> ##########################################################
> >>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >>
> >>
> >> --
>
>> Regards, > >> Shachi > >> > >> > >> > >> > >> > >> -- > >> Regards, > >> Shachi > >> > > > > > > > -- > Regards, > Shachi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. ======================================================================= From cjfields at illinois.edu Tue Aug 9 20:10:37 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 9 Aug 2011 15:10:37 -0500 Subject: [Bioperl-l] Question to Bio::SearchIO::infernal.pm In-Reply-To: References: <4E32E14B020000EE00004F57@gwia1.boku.ac.at> Message-ID: <683C7B42-338F-42AE-AF93-11BFB4DB2CB7@illinois.edu> Following this up: Nadine, did you have a bug to report? It's kind of hard to fix this without some example data. chris On Aug 3, 2011, at 8:10 AM, Chris Fields wrote: > Nadine, > > Hard to guess w/o seeing the report, but I'm not terribly surprised. I believe I only coded for simple 1 CM reports, IIRC. You'll have to file this as a bug on redmine along with an example. > > chris > > On Jul 29, 2011, at 9:35 AM, Nadine Elpida Tatto wrote: > >> Hi There! >> >> >> >> I was wondering if you would or can help me. >> >> >> I have an infernal report containing about 2000 CMs from an infernal run against Rfam.cm. To parse this report I wanted to use Bio::SearchIO::infernal.pm. 
Unfortunately this turned out to be a problem for me, because "$parser->next_result" only delivers the result for the first CM in the report and nothing more. >> >> >> My code: >> #!/usr/bin/perl -w >> >> >> use strict;use Data::Dumper; >> use Bio::SearchIO; >> >> >> my $infile = $ARGV[0]; # infernal report >> my $parser = Bio::SearchIO->new(-format => 'Infernal', >> -file => $infile); >> >> >> while( my $result = $parser->next_result ) { >> print $result->query_name . "\n"; >> } >> >> >> exit; >> >> >> >> >> The output: >> >> >> ntatto:~$ ./infernalParser.pl infernal.output >> 5S_rRNA >> ntatto:~$ >> >> >> >> >> I would expect the following (like parsing a blast report): >> >> >> ntatto:~$ ./infernalParser.pl infernal.output >> 5S_rRNA >> 5_8S_rRNA >> U1 >> ... >> ntatto:~$ >> >> >> >> I would be glad for help. >> >> >> Thank you in advance. >> >> >> Best Regards >> >> >> N Tatto >> >> >> >> >> >> >> >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From torsten.seemann at infotech.monash.edu.au Sun Aug 14 08:32:46 2011 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Sun, 14 Aug 2011 18:32:46 +1000 Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation In-Reply-To: <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> Message-ID: > I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources. 

I'm pretty sure that NCBI/Genbank do *not* accept Genbank files for
submission - which I found somewhat ironic!

They require an ASN.1 formatted file (XML-like hierarchical format,
pre-dates XML), which is sometimes given a .sqn extension if you use
the Sequin GUI to prepare it. There are command line tools like
"tbl2asn" which will take the .tbl and .fsa files Brian has listed to
produce the ASN file too.

As far as I know, there are no NCBI tools to take a .gbk and produce
the .tbl/.fsa/.agp - does anyone know otherwise?

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
University, AUSTRALIA

From cjfields at illinois.edu Sun Aug 14 14:22:10 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 14 Aug 2011 09:22:10 -0500
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To:
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID: <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>

Not that I'm aware of, though it shouldn't be hard to set something up using Bio::SeqIO for that.

chris

On Aug 14, 2011, at 3:32 AM, Torsten Seemann wrote:

>> I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.
>
> I'm pretty sure that NCBI/Genbank do *not* accept Genbank files for
> submission - which I found somewhat ironic!
>
> They require an ASN.1 formatted file (XML-like hierarchical format,
> pre-dates XML), which is sometimes given a .sqn extension if you use
> the Sequin GUI to prepare it. There are command line tools like
> "tbl2asn" which will take the .tbl and .fsa files Brian has listed to
> produce the ASN file too.
>
> As far as I know, there are no NCBI tools to take a .gbk and produce
> the .tbl/.fsa/.agp - does anyone know otherwise?
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
> University, AUSTRALIA
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From maximilien1er at gmail.com Sun Aug 14 14:23:39 2011
From: maximilien1er at gmail.com (Maxime Déraspe)
Date: Sun, 14 Aug 2011 10:23:39 -0400
Subject: [Bioperl-l] Genbank file : bad features (tag) order with /translation
In-Reply-To:
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net>
Message-ID: <1313331819.15034.4.camel@maximilian-home>

I know that Artemis from the Sanger Institute can convert a GenBank file into a Sequin tab file. You could then use that file to submit to NCBI with their Sequin software. But I think that the GenBank file would be OK too.

Max

On Sun, 2011-08-14 at 18:32 +1000, Torsten Seemann wrote:
> > I currently use BioPerl and SeqIO::genbank to create the *gbf files for NCBI submission, they've always accepted them. In fact I think they don't even use them, I believe they use the *tbl, *fsa, and *agp files and the ASN file as data sources.
>
> I'm pretty sure that NCBI/Genbank do *not* accept Genbank files for
> submission - which I found somewhat ironic!
>
> They require an ASN.1 formatted file (XML-like hierarchical format,
> pre-dates XML), which is sometimes given a .sqn extension if you use
> the Sequin GUI to prepare it. There are command line tools like
> "tbl2asn" which will take the .tbl and .fsa files Brian has listed to
> produce the ASN file too.
>
> As far as I know, there are no NCBI tools to take a .gbk and produce
> the .tbl/.fsa/.agp - does anyone know otherwise?
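
For anyone following up on Chris's remark that this "shouldn't be hard to set something up using Bio::SeqIO", here is a rough sketch of a GenBank to feature-table (.tbl) converter. It is only an illustration under stated assumptions, not a tested submission tool: the helper names tbl_feature() and genbank_to_tbl() are made up for this example, the /translation qualifier is skipped on the assumption that tbl2asn derives it, and partial (< >) coordinates, interval ordering for multi-interval minus-strand features, and any mapping between GenBank and .tbl qualifier names are not handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Format one feature as feature-table (.tbl) lines.
#   $ranges: array ref of [start, stop] pairs; for minus-strand
#            intervals the caller passes start > stop
#   $key:    feature key, e.g. 'CDS'
#   $quals:  array ref of [qualifier, value] pairs
# (hypothetical helper, named for this sketch only)
sub tbl_feature {
    my ($ranges, $key, $quals) = @_;
    my @lines;
    for my $i (0 .. $#$ranges) {
        my @cols = @{ $ranges->[$i] };
        push @cols, $key if $i == 0;    # only the first interval names the key
        push @lines, join("\t", @cols);
    }
    # Qualifier lines are indented by three tabs.
    push @lines, join("\t", '', '', '', @$_) for @$quals;
    return join("\n", @lines) . "\n";
}

# Read a GenBank file with Bio::SeqIO and return feature-table text.
# (hypothetical helper, named for this sketch only)
sub genbank_to_tbl {
    my ($file) = @_;
    require Bio::SeqIO;    # loaded lazily, so tbl_feature() runs without BioPerl
    my $in  = Bio::SeqIO->new(-file => $file, -format => 'genbank');
    my $out = '';
    while (my $seq = $in->next_seq) {
        $out .= '>Feature ' . $seq->display_id . "\n";
        for my $feat ($seq->get_SeqFeatures) {
            next if $feat->primary_tag eq 'source';
            my @ranges;
            for my $loc ($feat->location->each_Location) {
                my ($s, $e) = ($loc->start, $loc->end);
                # Minus-strand intervals are written with start > stop.
                ($s, $e) = ($e, $s) if ($loc->strand || 0) < 0;
                push @ranges, [$s, $e];
            }
            my @quals;
            for my $tag ($feat->get_all_tags) {
                next if $tag eq 'translation';    # assumption: left to tbl2asn
                push @quals, [$tag, $_] for $feat->get_tag_values($tag);
            }
            $out .= tbl_feature(\@ranges, $feat->primary_tag, \@quals);
        }
    }
    return $out;
}

# Usage: perl gbk2tbl.pl input.gbk > output.tbl
print genbank_to_tbl($ARGV[0]) if @ARGV;
```

The formatting is kept in a separate function from the BioPerl parsing so it can be checked on plain data; the output would still need to be validated with tbl2asn before any real submission.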
>

From punit_vergoboy2004 at yahoo.co.in Thu Aug 18 12:14:54 2011
From: punit_vergoboy2004 at yahoo.co.in (punit kumar)
Date: Thu, 18 Aug 2011 17:44:54 +0530 (IST)
Subject: [Bioperl-l] query about Bio::Tools::Run::RemoteBlast
In-Reply-To: <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu>
Message-ID: <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>

Hi friends,

I am new to BioPerl, and I am using "Bio::Tools::Run::RemoteBlast" for remote BLAST. I tried to use this module and have had a little success so far. I want to get the description part of the BLAST alignments found against my query sequence, as shown in the format below, which is the output table of online BLAST:

Sequences producing significant alignments:
Accession Description Max score Total score Query coverage E value Links
NP_216760.1 acyl carrier protein [Mycobacterium tuberculosis H37Rv] >ref|NP_336774.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >ref|NP_855917.1| acyl carrier protein [Mycobacterium bovis AF2122/97] >ref|YP_978350.1| acyl carrier protein [Mycobacterium bovis BCG str. Pasteur 1173P2] >ref|YP_001283588.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_001288206.1| acyl carrier protein [Mycobacterium tuberculosis F11] >ref|ZP_02551632.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_002645307.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >ref|YP_003031689.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >ref|ZP_04925721.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >ref|ZP_04981085.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str.
Haarlem] >ref|ZP_05141736.1| acyl carrier protein [Mycobacterium tuberculosis '98-R604 INH-RIF-EM'] >ref|ZP_06433498.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >ref|ZP_06437620.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >ref|ZP_06443178.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >ref|ZP_06450592.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >ref|ZP_06455160.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >ref|ZP_06504896.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >ref|ZP_06510220.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >ref|ZP_06513730.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >ref|ZP_06517747.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >ref|ZP_06521786.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >ref|ZP_06799170.1| acyl carrier protein [Mycobacterium tuberculosis 210] >ref|ZP_06952619.1| acyl carrier protein [Mycobacterium tuberculosis KZN 4207] >ref|ZP_06960948.1| acyl carrier protein [Mycobacterium tuberculosis KZN R506] >ref|ZP_07013145.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >ref|ZP_07414839.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >ref|ZP_07418616.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >ref|ZP_07423348.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >ref|ZP_07427715.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >ref|ZP_07432018.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 
SUMu005] >ref|ZP_07436410.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >ref|ZP_07440655.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >ref|ZP_07445228.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >ref|ZP_07481045.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >ref|ZP_07485275.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >ref|ZP_07489492.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >ref|ZP_07494023.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >ref|ZP_07816044.1| acyl carrier protein [Mycobacterium tuberculosis KZN V2475] >ref|YP_004723912.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >ref|YP_004745700.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >sp|P0A4W6.1|ACPM_MYCTU RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >sp|P0A4W7.1|ACPM_MYCBO RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA94640.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium tuberculosis H37Rv] >gb|AAK46588.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >emb|CAD97121.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium bovis AF2122/97] >emb|CAL72249.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. Pasteur 1173P2] >gb|EAY60463.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >gb|EBA42598.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. 
Haarlem] >gb|ABQ74026.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium tuberculosis H37Ra] >gb|ABR06604.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis F11] >dbj|BAH26539.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >gb|ACT24794.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >gb|EFD13913.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >gb|EFD18035.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >gb|EFD21093.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >gb|EFD43942.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >gb|EFD47767.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >gb|EFD53534.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >gb|EFD58858.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >gb|EFD62368.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >gb|EFD73930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >gb|EFD77945.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >gb|EFI30824.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >gb|EFO74536.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >gb|EFP15742.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >gb|EFP19094.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >gb|EFP22930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >gb|EFP26734.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 
SUMu005] >gb|EFP30496.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >gb|EFP33906.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >gb|EFP38213.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >gb|EFP42922.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >gb|EFP46864.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >gb|EFP50800.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >gb|EFP54373.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >gb|EGB28294.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CDC1551A] >gb|EGE50793.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis W-148] >gb|AEB03875.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 4207] >gb|AEJ47271.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5079] >gb|AEJ50890.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5180] >emb|CCC27325.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >emb|CCC44598.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >emb|CCC64838.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. Moreau RDJ] 223 223 100% 1e-74 1KLP_A Chain A, The Solution Structure Of Acyl Carrier Protein From Mycobacterium Tuberculosis 220 220 99% 2e-73 ZP_04748738.1 acyl carrier protein [Mycobacterium kansasii ATCC 12478] 165 165 100% 9e-52 ZP_05224070.1 acyl carrier protein [Mycobacterium intracellulare ATCC 13950] 162 162 100% 8e-51 NP_960931.1 acyl carrier protein [Mycobacterium avium subsp. 
paratuberculosis K-10] >ref|YP_881402.1| acyl carrier protein [Mycobacterium avium 104] >ref|ZP_05216419.1| acyl carrier protein [Mycobacterium avium subsp. avium ATCC 25291] >gb|AAS04314.1| AcpM [Mycobacterium avium subsp. paratuberculosis K-10] >gb|ABK65172.1| acyl carrier protein [Mycobacterium avium 104] >gb|EGO40713.1| acyl carrier protein [Mycobacterium avium subsp. paratuberculosis S397] 162 162 100% 8e-51
NP_302135.1 acyl carrier protein [Mycobacterium leprae TN] >ref|YP_002503765.1| acyl carrier protein [Mycobacterium leprae Br4923] >sp|O69475.1|ACPM_MYCLE RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA19202.1| acyl carrier protein [Mycobacterium leprae] >emb|CAC30605.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae] >emb|CAR71749.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae Br4923] 162 162 100% 2e-50
ZP_07966703.1 hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] >gb|EFV12044.1| hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] 162 162 88% 3e-50
YP_905336.1 acyl carrier protein [Mycobacterium ulcerans Agy99] >ref|YP_001851618.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] >gb|ABL03865.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium ulcerans Agy99] >gb|ACC41763.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] 161 161 100% 3e-50
ZP_08713925.1 acyl carrier protein [Mycobacterium colombiense CECT 3035] >gb|EGT87768.1| acyl carrier protein [Mycobacterium colombiense CECT 3035] 160 160 100% 6e-50
YP_003660002.1 phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] >gb|ADG99171.1| phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] 160 160 88% 8e-50

In my code:

print "hit name is ",$hit->name, "\n"; # gives me the reference of the aligned sequence
print "Score: ".$hsp->score."\n"; # gives me the score of the aligned sequence
print "E-val: ".$hsp->expect."\n"; # gives me the E-value of the aligned sequence
print "percent identity: ".$hsp->percent_identity."\n"; # gives me the query coverage of the aligned sequence

I want to use

#print "Description ",$hsp->desc, "\n";

to show the description, but I am not getting it. Can anybody help me out with this? I need to know urgently. Thanks for reading, and I hope I was successful in explaining my problem.

Below is a copy of the code I am trying to use:

use Bio::Tools::Run::RemoteBlast;
use Bio::SeqIO; # needed for the FASTA reader below
use strict;
my $v = 1;
my $prog = 'blastp';
my $db   = 'refseq_protein';
my $e_val= '1e-10'; #1e-10
my $result;
#my $code=q| my $answer = my $a / my $b;|;

my @params = ( '-prog' => $prog,
               '-data' => $db,
               '-expect' => $e_val );
my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
$v = 1;
my $str = Bio::SeqIO->new(-file=>'prot.txt' , '-format' => 'fasta' );
my $input;
while($input = $str->next_seq())
{

  # Blast a sequence against a database:

  my $r = $factory->submit_blast($input);
  print STDERR "waiting..." if( $v > 0 );

  my %hit_evalue;
  my @evalue;

  while ( my @rids = $factory->each_rid ) {
    foreach my $rid ( @rids ) {
      my $rc = $factory->retrieve_blast($rid);
      if( !ref($rc) ) {
        if( $rc < 0 ) {
          $factory->remove_rid($rid);
        }
        print STDERR "." if ( $v > 0 );
        sleep 5;
      } else {
        $factory->remove_rid($rid);
        #print $rid."\n\n";
        my $result = $rc->next_result;

        print "db is ", $result->database_name(), "\n";
        my $count = 0;
        while( my $hit = $result->next_hit ) {
          $count++;
          #next unless ( $v > 0);
          #print "hit name is ", $hit->name, "\n";
          while( my $hsp = $hit->next_hsp ) {
            print "hit name is ",$hit->name, "\n";
            #print "Query name is ",$hsp->desc, "\n"; exit;

            print"Score: ".$hsp->score."\n";
            print"E-val: ".$hsp->expect."\n";
            print"percent identity: ".$hsp->percent_identity."\n";
          }
        }
      }
    }
  }
}

From pcantalupo at gmail.com Thu Aug 18 12:55:18 2011
From: pcantalupo at gmail.com (Paul Cantalupo)
Date: Thu, 18 Aug 2011 08:55:18 -0400
Subject: [Bioperl-l] query about Bio::Tools::Run::RemoteBlast
In-Reply-To: <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>
References: <7717ecd0-c9b6-4ecc-895f-db60ca89e259@l7g2000vbz.googlegroups.com> <5D9B8006-A20E-4AAE-88EF-0A1DCA56B26E@verizon.net> <410A8BF5-D5EF-4E7A-B91C-D3DDACBABB75@illinois.edu> <1313669694.59013.YahooMailNeo@web137303.mail.in.yahoo.com>
Message-ID:

Punit

I think you want '$hit->description' not '$hsp->desc'

Paul

Paul Cantalupo
University of Pittsburgh

On Thu, Aug 18, 2011 at 8:14 AM, punit kumar wrote:
> Hi friends,
>
> I am new to BioPerl, and I am using "Bio::Tools::Run::RemoteBlast" for remote BLAST. I tried to use this module and have had a little success so far. I want to get the description part of the BLAST alignments found against my query sequence, as shown in the format below, which is the output table of online BLAST:
>
> Sequences producing significant alignments:
> Accession
> Description
> Max score
> Total score
> Query coverage
> E value
> Links
> NP_216760.1 acyl carrier protein [Mycobacterium tuberculosis H37Rv] >ref|NP_336774.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >ref|NP_855917.1| acyl carrier protein [Mycobacterium bovis AF2122/97] >ref|YP_978350.1| acyl carrier protein [Mycobacterium bovis BCG str. Pasteur 1173P2] >ref|YP_001283588.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_001288206.1| acyl carrier protein [Mycobacterium tuberculosis F11] >ref|ZP_02551632.1| acyl carrier protein [Mycobacterium tuberculosis H37Ra] >ref|YP_002645307.1| acyl carrier protein [Mycobacterium bovis BCG str.
Tokyo 172] >ref|YP_003031689.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >ref|ZP_04925721.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >ref|ZP_04981085.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. Haarlem] >ref|ZP_05141736.1| acyl carrier > protein [Mycobacterium tuberculosis '98-R604 INH-RIF-EM'] >ref|ZP_06433498.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >ref|ZP_06437620.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >ref|ZP_06443178.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >ref|ZP_06450592.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >ref|ZP_06455160.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >ref|ZP_06504896.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >ref|ZP_06510220.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T92] >ref|ZP_06513730.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >ref|ZP_06517747.1| meromycolate extension acyl carrier protein acpm > [Mycobacterium tuberculosis T85] >ref|ZP_06521786.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >ref|ZP_06799170.1| acyl carrier protein [Mycobacterium tuberculosis 210] >ref|ZP_06952619.1| acyl carrier protein [Mycobacterium tuberculosis KZN 4207] >ref|ZP_06960948.1| acyl carrier protein [Mycobacterium tuberculosis KZN R506] >ref|ZP_07013145.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >ref|ZP_07414839.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >ref|ZP_07418616.1| meromycolate extension acyl carrier protein acpM [Mycobacterium 
tuberculosis SUMu002] >ref|ZP_07423348.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu003] >ref|ZP_07427715.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >ref|ZP_07432018.1| meromycolate extension acyl carrier protein > acpM [Mycobacterium tuberculosis SUMu005] >ref|ZP_07436410.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >ref|ZP_07440655.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >ref|ZP_07445228.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >ref|ZP_07481045.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >ref|ZP_07485275.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >ref|ZP_07489492.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >ref|ZP_07494023.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >ref|ZP_07816044.1| acyl carrier protein [Mycobacterium tuberculosis KZN V2475] >ref|YP_004723912.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum > GM041182] >ref|YP_004745700.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >sp|P0A4W6.1|ACPM_MYCTU RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >sp|P0A4W7.1|ACPM_MYCBO RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA94640.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium tuberculosis H37Rv] >gb|AAK46588.1| acyl carrier protein [Mycobacterium tuberculosis CDC1551] >emb|CAD97121.1| MEROMYCOLATE EXTENSION ACYL CARRIER PROTEIN ACPM [Mycobacterium bovis AF2122/97] >emb|CAL72249.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. 
Pasteur 1173P2] >gb|EAY60463.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis C] >gb|EBA42598.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis str. Haarlem] >gb|ABQ74026.1| meromycolate extension acyl carrier protein AcpM > [Mycobacterium tuberculosis H37Ra] >gb|ABR06604.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis F11] >dbj|BAH26539.1| acyl carrier protein [Mycobacterium bovis BCG str. Tokyo 172] >gb|ACT24794.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 1435] >gb|EFD13913.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T46] >gb|EFD18035.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CPHL_A] >gb|EFD21093.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis KZN 605] >gb|EFD43942.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis K85] >gb|EFD47767.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis T17] >gb|EFD53534.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 02_1987] >gb|EFD58858.1| meromycolate extension acyl carrier > protein acpM [Mycobacterium tuberculosis T92] >gb|EFD62368.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis EAS054] >gb|EFD73930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis GM 1503] >gb|EFD77945.1| meromycolate extension acyl carrier protein acpm [Mycobacterium tuberculosis T85] >gb|EFI30824.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis 94_M4241A] >gb|EFO74536.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu001] >gb|EFP15742.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu002] >gb|EFP19094.1| meromycolate extension acyl carrier protein acpM [Mycobacterium 
tuberculosis SUMu003] >gb|EFP22930.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu004] >gb|EFP26734.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu005] > >gb|EFP30496.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu006] >gb|EFP33906.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu007] >gb|EFP38213.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu008] >gb|EFP42922.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu009] >gb|EFP46864.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu010] >gb|EFP50800.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu011] >gb|EFP54373.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis SUMu012] >gb|EGB28294.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis CDC1551A] >gb|EGE50793.1| meromycolate extension acyl carrier protein acpM [Mycobacterium tuberculosis W-148] >gb|AEB03875.1| meromycolate extension acyl > carrier protein acpM [Mycobacterium tuberculosis KZN 4207] >gb|AEJ47271.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5079] >gb|AEJ50890.1| acyl carrier protein [Mycobacterium tuberculosis CCDC5180] >emb|CCC27325.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium africanum GM041182] >emb|CCC44598.1| meromycolate extension acyl carrier protein ACPM [Mycobacterium canettii CIPT 140010059] >emb|CCC64838.1| Meromycolate extension acyl carrier protein acpM [Mycobacterium bovis BCG str. 
Moreau RDJ] 223 223 100% 1e-74 > 1KLP_A Chain A, The Solution Structure Of Acyl Carrier Protein From Mycobacterium Tuberculosis 220 220 99% 2e-73 > ZP_04748738.1 acyl carrier protein [Mycobacterium kansasii ATCC 12478] 165 165 100% 9e-52 > ZP_05224070.1 acyl carrier protein [Mycobacterium intracellulare ATCC 13950] 162 162 100% 8e-51 > NP_960931.1 acyl carrier protein [Mycobacterium avium subsp. paratuberculosis K-10] >ref|YP_881402.1| acyl carrier protein [Mycobacterium avium 104] >ref|ZP_05216419.1| acyl carrier protein [Mycobacterium avium subsp. avium ATCC 25291] >gb|AAS04314.1| AcpM [Mycobacterium avium subsp. paratuberculosis K-10] >gb|ABK65172.1| acyl carrier protein [Mycobacterium avium 104] >gb|EGO40713.1| acyl carrier protein [Mycobacterium avium subsp. paratuberculosis S397] 162 162 100% 8e-51 > NP_302135.1 acyl carrier protein [Mycobacterium leprae TN] >ref|YP_002503765.1| acyl carrier protein [Mycobacterium leprae Br4923] >sp|O69475.1|ACPM_MYCLE RecName: Full=Meromycolate extension acyl carrier protein; Short=ACP >emb|CAA19202.1| acyl carrier protein [Mycobacterium leprae] >emb|CAC30605.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae] >emb|CAR71749.1| acyl carrier protein (meromycolate extension) [Mycobacterium leprae Br4923] 162 162 100% 2e-50 > ZP_07966703.1 hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] >gb|EFV12044.1| hypothetical protein HMPREF9336_03075 [Segniliparus rugosus ATCC BAA-974] 162 162 88% 3e-50 > YP_905336.1 acyl carrier protein [Mycobacterium ulcerans Agy99] >ref|YP_001851618.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] >gb|ABL03865.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium ulcerans Agy99] >gb|ACC41763.1| meromycolate extension acyl carrier protein AcpM [Mycobacterium marinum M] 161 161 100% 3e-50 > ZP_08713925.1 acyl carrier protein [Mycobacterium colombiense CECT 3035] >gb|EGT87768.1| acyl carrier protein [Mycobacterium 
colombiense CECT 3035] 160 160 100% 6e-50 > YP_003660002.1 phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] >gb|ADG99171.1| phosphopantetheine-binding protein [Segniliparus rotundus DSM 44985] 160 160 88% 8e-50 > > where in my code: > > print "hit name is ",$hit->name, "\n"; # gives me the refrence of aligned sequence > print"Score: ".$hsp->score."\n"; # gives me the score of aligned sequence > print"E-val: ".$hsp->expect."\n"; # gives me the evalue of aligned sequence > print"percent identity: ".$hsp->percent_identity."\n"; # gives me the query coverage of aligned sequence > > i want to use #print "Description ",$hsp->desc, "\n"; to show the description but i am not getting can any body help me out for this i need to know urgently, thanks to read and i hope i was succesfull to explain my problem . > > below is the copy of my code i am trying to use : > > > > > use Bio::Tools::Run::RemoteBlast; > use strict; > my $v = 1; > my $prog = 'blastp'; > my $db = 'refseq_protein'; > my $e_val= '1e-10'; #1e-10 > > my $result; > #my $code=q| my $answer = my $a / my $b;|; > > > > > > my @params = ( > '-prog' => $prog, > '-data' => $db, > '-expect' => $e_val > ); > > my $factory = Bio::Tools::Run::RemoteBlast->new(@params); > $v = 1; > my $str = Bio::SeqIO->new(-file=>'prot.txt' , '-format' => 'fasta' ); > my $input; > while($input = $str->next_seq()) > { > > # Blast a sequence against a database: > > my $r = $factory->submit_blast($input); > print STDERR "waiting..." if( $v > 0 ); > > my %hit_evalue; > my @evalue; > > while ( my @rids = $factory->each_rid ) { > foreach my $rid ( @rids ) { > my $rc = $factory->retrieve_blast($rid); > if( !ref($rc) ) { > if( $rc < 0 ) { > $factory->remove_rid($rid); > } > print STDERR "." 
>         if ( $v > 0 );
>         sleep 5;
>       } else {
>         $factory->remove_rid($rid);
>         #print $rid."\n\n";
>         my $result = $rc->next_result;
>
>         print "db is ", $result->database_name(), "\n";
>         my $count = 0;
>         while( my $hit = $result->next_hit ) {
>           $count++;
>           #next unless ( $v > 0);
>           #print "hit name is ", $hit->name, "\n";
>           while( my $hsp = $hit->next_hsp )
>           {
>             print "hit name is ",$hit->name, "\n";
>             #print "Query name is ",$hsp->desc, "\n"; exit;
>
>             print"Score: ".$hsp->score."\n";
>             print"E-val: ".$hsp->expect."\n";
>             print"percent identity: ".$hsp->percent_identity."\n";
>           }
>
>
>         }
>       }
>     }
>   }
> }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From David.Messina at sbc.su.se Fri Aug 19 09:07:35 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 19 Aug 2011 11:07:35 +0200
Subject: [Bioperl-l] Fwd: pls help..
In-Reply-To:
References:
Message-ID:

Whoops, resending: the attachment was too big.

Ravi, please provide only a few example lines from your GFF file, or host the file elsewhere and post a link to it.

Dave

---------- Forwarded message ----------
From: Dave Messina
Date: Fri, Aug 19, 2011 at 10:53
Subject: pls help..
To: ravi.devani89 at gmail.com
Cc: bioperl-l

Ravi,

Your message belongs on the main BioPerl list, not the bioperl-dev list, so I'm reposting it there.

To sign up for the main list, go to:
http://bioperl.org/mailman/listinfo/bioperl-l

Dave

---------- Forwarded message ----------
From: Ravi Devani
To: bioperl-dev at lists.open-bio.org
Date: Fri, 19 Aug 2011 13:54:22 +0530
Subject: Fwd: pls help..

i tried to create a gff3 file from .gbk file using bioperl genbank2gff3 script but what i get is same features repeating many times.. and the file keeps growing in size until my hard disk gets full.. i have tried to filter all other features except "region" but still it repeats a single entry many times..
i have attached a part of the file generated.. pls kindly help me.

From ravi.devani89 at gmail.com Fri Aug 19 05:16:00 2011
From: ravi.devani89 at gmail.com (Ravi Devani)
Date: Fri, 19 Aug 2011 10:46:00 +0530
Subject: [Bioperl-l] pls help..
In-Reply-To:
References:
Message-ID:

---------- Forwarded message ----------
From: Ravi Devani
Date: Thu, Aug 18, 2011 at 12:40 PM
Subject: pls help..
To: scott at scottcain.net

i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but what i get is same features repeating many times.. and the file keeps growing in size until my hard disk gets full.. i have tried to filter all other features except "region" but still it repeats a single entry many times.. i have attached a part of the file generated.. pls kindly help me.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ref_chrUn.gff
Type: application/octet-stream
Size: 602112 bytes
Desc: not available
URL:

From anjan.purkayastha at gmail.com Mon Aug 15 14:32:39 2011
From: anjan.purkayastha at gmail.com (ANJAN PURKAYASTHA)
Date: Mon, 15 Aug 2011 10:32:39 -0400
Subject: [Bioperl-l] Problem with Bio::DB::Taxonomy
Message-ID:

Hello,
I wrote a short test script for the Bio::DB::Taxonomy module:
================================================
#!/usr/bin/perl -w
use strict;
use Bio::DB::Taxonomy;

my ($nodesfile, $namesfile)= ('nodes.dmp', 'names.dmp');

my $db= new Bio::DB::Taxonomy(-source => 'flatfile',
                              -nodesfile => $nodesfile,
                              -namesfile => $namesfile
                             );

my $bacteria= $db->get_Taxonomy_Node(-taxonid => '2');
print("$bacteria->id\t$bacteria->name\n");
================================================

On running this script I expect the following output: 2 Bacteria.

Instead I get a warning:
UNIVERSAL->import is deprecated and will be removed in a future perl at
/usr/share/perl5/vendor_perl/Bio/Tree/TreeFunctionsI.pm line 94.

and the following output:
Bio::Taxon=HASH(0x158dbe0)->id Bio::Taxon=HASH(0x158dbe0)->name

The script seems to be working but there seems to be a problem with
dereferencing a Bio::Taxon object.

Any leads on how to troubleshoot this will be much appreciated.
Thanks
Anjan

--
===================================
Anjan Purkayastha, PhD
Senior Computational Biologist
TessArae LLC
46090 Lake Center Plaza, Suite 304
Potomac Falls, VA 20165
Office- 703.444.7188 ext. 116
Mobile-703.740.6939
===================================

From scott at scottcain.net Fri Aug 19 13:45:47 2011
From: scott at scottcain.net (Scott Cain)
Date: Fri, 19 Aug 2011 09:45:47 -0400
Subject: [Bioperl-l] pls help..
In-Reply-To:
References:
Message-ID: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net>

Ravi,

The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using.

Scott

Sent from my iPhone

On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote:

> ---------- Forwarded message ----------
> From: Ravi Devani
> Date: Thu, Aug 18, 2011 at 12:40 PM
> Subject: pls help..
> To: scott at scottcain.net
>
>
> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but
> what i get is same features repeating many times.. and the file keeps
> growing in size until my hard disk gets full.. i have tried to filter all
> other features except "region" but still it repeats a single entry many
> times.. i have attached a part of the file generated.. pls kindly help me.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at illinois.edu Fri Aug 19 14:05:03 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 19 Aug 2011 09:05:03 -0500
Subject: [Bioperl-l] Problem with Bio::DB::Taxonomy
In-Reply-To:
References:
Message-ID: <7A733494-D831-43ED-9AE4-AB62AC5A2761@illinois.edu>

Anjan,

You are likely using an old version of BioPerl (I believe this was fixed in the latest release on CPAN). Bio::DB::Taxonomy uses Bio::Taxon, so the use of name() is incorrect; the method is node_name(). If name() is documented somewhere, that documentation is incorrect, so let us know where it came from.

Also, the print statement at the end isn't interpolating correctly (method calls are not interpolated inside a double-quoted string); in general with objects I make this more explicit:

print $bacteria->id."\t".$bacteria->node_name."\n";

Correcting that, it works for me:

[cjfields@pyrimidine1 anjan]$ perl test.pl
2 Bacteria

chris

On Aug 15, 2011, at 9:32 AM, ANJAN PURKAYASTHA wrote:

> Hello,
> I wrote a short test script for the Bio::DB::Taxonomy module:
> ================================================
> #!/usr/bin/perl -w
> use strict;
> use Bio::DB::Taxonomy;
>
> my ($nodesfile, $namesfile)= ('nodes.dmp', 'names.dmp');
>
> my $db= new Bio::DB::Taxonomy(-source => 'flatfile',
>                               -nodesfile => $nodesfile,
>                               -namesfile => $namesfile
>                              );
>
> my $bacteria= $db->get_Taxonomy_Node(-taxonid => '2');
> print("$bacteria->id\t$bacteria->name\n");
> ================================================
>
> On running this script I expect the following output: 2 Bacteria.
>
> Instead I get a warning:
> UNIVERSAL->import is deprecated and will be removed in a future perl at
> /usr/share/perl5/vendor_perl/Bio/Tree/TreeFunctionsI.pm line 94.
>
> and the following output:
> Bio::Taxon=HASH(0x158dbe0)->id Bio::Taxon=HASH(0x158dbe0)->name
>
> The script seems to be working but there seems to be a problem with
> dereferencing a Bio::Taxon object.
>
> Any leads on how to troubleshoot this will be much appreciated.
> Thanks
> Anjan
>
>
>
> --
> ===================================
> Anjan Purkayastha, PhD
> Senior Computational Biologist
> TessArae LLC
> 46090 Lake Center Plaza, Suite 304
> Potomac Falls, VA 20165
> Office- 703.444.7188 ext. 116
> Mobile-703.740.6939
> ===================================
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at illinois.edu Fri Aug 19 14:26:06 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 19 Aug 2011 09:26:06 -0500
Subject: [Bioperl-l] pls help..
In-Reply-To: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net>
References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net>
Message-ID: <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>

Scott,

http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview

(it's in the GFF file)

It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note).

On Aug 19, 2011, at 8:45 AM, Scott Cain wrote:

> Ravi,
>
> The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using.
>
> Scott
>
>
> Sent from my iPhone
>
> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote:
>
>> ---------- Forwarded message ----------
>> From: Ravi Devani
>> Date: Thu, Aug 18, 2011 at 12:40 PM
>> Subject: pls help..
>> To: scott at scottcain.net
>>
>>
>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but
>> what i get is same features repeating many times..
>> and the file keeps
>> growing in size until my hard disk gets full.. i have tried to filter all
>> other features except "region" but still it repeats a single entry many
>> times.. i have attached a part of the file generated.. pls kindly help me.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From scott at scottcain.net Fri Aug 19 14:38:16 2011
From: scott at scottcain.net (Scott Cain)
Date: Fri, 19 Aug 2011 10:38:16 -0400
Subject: [Bioperl-l] pls help..
In-Reply-To: <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>
References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>
Message-ID:

I was wondering if perhaps the genbank file had been manipulated in some way.

Scott


On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields wrote:
> Scott,
>
> http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview
>
> (it's in the GFF file)
>
> It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note).
>
> On Aug 19, 2011, at 8:45 AM, Scott Cain wrote:
>
>> Ravi,
>>
>> The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using.
>>
>> Scott
>>
>>
>> Sent from my iPhone
>>
>> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote:
>>
>>> ---------- Forwarded message ----------
>>> From: Ravi Devani
>>> Date: Thu, Aug 18, 2011 at 12:40 PM
>>> Subject: pls help..
>>> To: scott at scottcain.net
>>>
>>>
>>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but
>>> what i get is same features repeating many times..
>>> and the file keeps
>>> growing in size until my hard disk gets full.. i have tried to filter all
>>> other features except "region" but still it repeats a single entry many
>>> times.. i have attached a part of the file generated.. pls kindly help me.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

From cjfields at illinois.edu Fri Aug 19 19:19:40 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 19 Aug 2011 14:19:40 -0500
Subject: [Bioperl-l] pls help..
In-Reply-To:
References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>
Message-ID:

Yeah, the output is rather odd. Maybe it's using the contig file version?

chris

On Aug 19, 2011, at 9:38 AM, Scott Cain wrote:

> I was wondering if perhaps the genbank file had been manipulated in some way.
>
> Scott
>
>
> On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields wrote:
>> Scott,
>>
>> http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview
>>
>> (it's in the GFF file)
>>
>> It definitely is getting stuck in a loop for the genomic region, but using the file for GFF3 doesn't make sense (very few features of note).
>>
>> On Aug 19, 2011, at 8:45 AM, Scott Cain wrote:
>>
>>> Ravi,
>>>
>>> The gff file is fairly useless from a debugging perspective. Can you please attach the genbank file you're using? Also, please indicate what version of bioperl you're using.
>>> >>> Scott >>> >>> >>> Sent from my iPhone >>> >>> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote: >>> >>>> ---------- Forwarded message ---------- >>>> From: Ravi Devani >>>> Date: Thu, Aug 18, 2011 at 12:40 PM >>>> Subject: pls help.. >>>> To: scott at scottcain.net >>>> >>>> >>>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl but >>>> what i get is same features repeating many times.. and the file keeps >>>> growing in size ntil my harddisk gets full.. i have tried to filter all >>>> other features except "region" but still it repeats a single entry many >>>> times.. i have attached a part of the file generated.. pls kindly help me. >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > > > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. scott at scottcain dot net > GMOD Coordinator (http://gmod.org/) 216-392-3087 > Ontario Institute for Cancer Research From hlapp at drycafe.net Sat Aug 20 03:38:51 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 19 Aug 2011 22:38:51 -0500 Subject: [Bioperl-l] [BioSQL-l] How is is_circular recorded in BioSQL (by BioPerl)? In-Reply-To: <4E2D79D6.6020108@gmail.com> References: <4E2D5000.30305@gmail.com> <4E2D5314.5090107@gmail.com> <4E2D5BAC.8020001@gmail.com> <4E2D79D6.6020108@gmail.com> Message-ID: <59AF5708-AECD-4375-9EB8-6E79D4B21C26@drycafe.net> I realize I'm chiming in here late, but the below sums it up quite well. In fact, biosequence.alphabet column was originally (pre-2002) called molecule, and the BioPerl Genbank writer defaults to alphabet() if molecule() is not defined. -hilmar Sent with a tap. 
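
A minimal sketch of the fallback described above, assuming a toy sequence class: ToySeq and locus_molecule are hypothetical names for illustration only, not BioPerl API and not the actual GenBank writer code. It only shows why a record whose molecule() was lost in storage could come back with the lower-case alphabet value on its LOCUS line.

```perl
use strict;
use warnings;

# Toy stand-in for a sequence object (hypothetical; NOT Bio::Seq::RichSeq).
package ToySeq;
sub new      { my ($class, %args) = @_; return bless {%args}, $class; }
sub molecule { return $_[0]{molecule} }   # e.g. 'DNA', may be undefined
sub alphabet { return $_[0]{alphabet} }   # e.g. 'dna', stored lower case

package main;

# Hypothetical helper: pick the molecule type to print on a LOCUS line,
# falling back to alphabet() when molecule() is not defined.
sub locus_molecule {
    my ($seq) = @_;
    my $mol = $seq->molecule;
    return defined $mol ? $mol : $seq->alphabet;
}

# A freshly parsed record carries both values.
my $fresh = ToySeq->new(molecule => 'DNA', alphabet => 'dna');

# A round-tripped record is assumed to have kept only the alphabet.
my $round_tripped = ToySeq->new(alphabet => 'dna');

print locus_molecule($fresh), "\n";          # molecule() wins when defined
print locus_molecule($round_tripped), "\n";  # falls back to alphabet()
```

If this matches what happens during a BioSQL round trip, storing the molecule type as an extra annotation (as suggested in the quoted message below) would be one way to keep the original case.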

On Jul 25, 2011, at 9:12 AM, Roy Chaudhuri wrote:

> As with the is_circular hack, you could store the molecule type by adding it as an annotation in the SequenceProcessor (it's stored as $seq->molecule by BioPerl).
>
> Actually, when round-tripping a GenBank file through BioSQL, the LOCUS line molecule type ends up in lower case, which makes me wonder if it is coming from alphabet in the biosequence table.

From hlapp at drycafe.net Sat Aug 20 01:02:12 2011
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 19 Aug 2011 20:02:12 -0500
Subject: [Bioperl-l] Error writing SequenceProcessor to associate GO terms in biosql database
In-Reply-To: <26C59A57-F54A-4237-8D97-4E7A77E55D59@sgul.ac.uk>
References: <26C59A57-F54A-4237-8D97-4E7A77E55D59@sgul.ac.uk>
Message-ID: <6BDB69DE-5856-4061-96FA-0CF2884EDD9E@drycafe.net>

Hi Adam

I'm not sure whether you've received a response to this. Apologies if not.

There is indeed a NOT NULL constraint on seqfeature_qualifier_value.value. The only other metadata association table in BioSQL that does this is location_qualifier_value. In the latter case there is arguably some sense to that (at least originally for locations the purpose of that table was pretty much to store the fuzzy location start/end properties), but for seqfeatures this looks like a bug to me. I'll post this to the BioSQL list and fix it if there are no objections, but feel free to drop the NOT NULL on that column yourself in the meantime.

The INSERT query gets constructed in the innards of Bioperl-db. There is no reason to mess with that for this problem though - just drop the NOT NULL constraint.

-hilmar

Sent with a tap.

On Jul 26, 2011, at 10:07 AM, Adam Witney wrote:

>
> Hi,
>
> I'm trying to write a SequenceProcessor for a genbank file to associate GO terms to the GO data preloaded in my biosql database.
The command looks like this: > > perl load_seqdatabase.pl --dbname=biosql --driver=Pg --host=myhost --port= 5432 --dbuser=user --dbpass=pass -format genbank -namespace testing -pipeline 'GOSequenceProcessor' --debug S_sonnei.EB1_s_sonnei.dat > > The SequenceProcessor process_seq looks like this: > > sub process_seq{ > my ($self,$seq) = @_; > > my @features = $seq->get_SeqFeatures(); > foreach my $feat ( @features ) { > if ( $feat->has_tag('db_xref') ) { > my @db_xrefs = $feat->get_tag_values('db_xref'); > > foreach my $db_xref (@db_xrefs) { > if ( $db_xref =~ m/^GO:/ ) { > my $term = Bio::Annotation::OntologyTerm->new(-identifier => $db_xref, > -ontology => 'Gene Ontology'); > $feat->annotation->add_Annotation($term); > } > } > } > } > > return ($seq); > } > > But this gives this error: > > preparing INSERT statement: INSERT INTO seqfeature_qualifier_value (seqfeature_id, term_id, rank) VALUES (?, ?, ?) > TermAdaptor::add_assoc: binding column 1 to "935181" (FK to Bio::SeqFeature::Generic) > TermAdaptor::add_assoc: binding column 2 to "50253" (FK to Bio::Annotation::OntologyTerm) > TermAdaptor::add_assoc: binding column 3 to "1" (rank) > > --------------------- WARNING --------------------- > MSG: TermAdaptor::add_assoc: unexpected failure of statement execution: ERROR: null value in column "value" violates not-null constraint > name: INSERT ASSOC [1] Bio::SeqFeature::Generic;Bio::Annotation::OntologyTerm > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::add_association /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:458 > STACK Bio::DB::BioSQL::AnnotationCollectionAdaptor::add_association /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:468 > STACK Bio::DB::BioSQL::SeqFeatureAdaptor::store_children /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/SeqFeatureAdaptor.pm:304 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
> STACK Bio::DB::Persistent::PersistentObject::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/Persistent/PersistentObject.pm:284
> STACK Bio::DB::BioSQL::SeqAdaptor::store_children /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/SeqAdaptor.pm:257
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
> STACK Bio::DB::Persistent::PersistentObject::store /var/users/adam/BioPerl/bioperl-db/lib//Bio/DB/Persistent/PersistentObject.pm:284
> STACK (eval) /var/users/adam/BioPerl/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
> STACK toplevel /var/users/adam/BioPerl/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
>
> As you can see it generates an INSERT against seqfeature_qualifier_value without including a 'value' field, which is of course defined as NOT NULL.
>
> Firstly, is this the best way to achieve this? And secondly, where is the INSERT statement put together? I can't seem to find it in the object hierarchy.
>
> Thanks
>
> adam
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From ulrik.stervbo at gmail.com Sun Aug 21 17:33:44 2011
From: ulrik.stervbo at gmail.com (Ulrik Stervbo)
Date: Sun, 21 Aug 2011 19:33:44 +0200
Subject: [Bioperl-l] Change of Expasy Protparam url
Message-ID:

It seems there are some minor changes to the URLs for the ExPASy services.

In Protparam.pm, line 110 should be changed from

@args=('-url'=>'http://www.expasy.org/cgi-bin/protparam','-form'=>'sequence',@args);

to

@args=('-url'=>'http://web.expasy.org/cgi-bin/protparam/protparam','-form'=>'sequence',@args);

At least it seems to be working here, after adding the change to my local Protparam.pm

Cheers,
Ulrik

From cjfields at illinois.edu Sun Aug 21 17:56:10 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 21 Aug 2011 12:56:10 -0500
Subject: [Bioperl-l] Change of Expasy Protparam url
In-Reply-To:
References:
Message-ID: <9178B7E4-6EF2-4BC7-9B1C-9E5B282B5012@illinois.edu>

Thanks for pointing that out. I've updated that on GitHub. The critical thing is to get some tests working, so that a failure of the web service doesn't happen again without raising some exceptions (so we can track this).

chris

On Aug 21, 2011, at 12:33 PM, Ulrik Stervbo wrote:

> it seems the there are some minor changes with the urls for the expasy-services.
>
> In the Protparam.pm, line 110 should be changed from
> @args=('-url'=>'http://www.expasy.org/cgi-bin/protparam','-form'=>'sequence',@args);
>
> to
>
> @args=('-url'=>'http://web.expasy.org/cgi-bin/protparam/protparam','-form'=>'sequence',@args);
>
> At least it seems to be working here, after adding the change to my
> local Protparam.pm
>
> Cheers,
> Ulrik
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From scott at scottcain.net Mon Aug 22 15:51:55 2011
From: scott at scottcain.net (Scott Cain)
Date: Mon, 22 Aug 2011 11:51:55 -0400
Subject: [Bioperl-l] pls help..
In-Reply-To:
References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>
Message-ID:

Hello Ravi,

Please keep the BioPerl mailing list cc'ed.

I downloaded your 1.7GB multi-genbank file and started processing it with bp_genbank2gff3.pl. I killed it when the GFF file got to 10GB; however, it was working as expected. I suggest you upgrade to the most recent release of BioPerl and try again. Additionally, it might make sense to break that big multi-genbank file into smaller files.

Scott

On Sun, Aug 21, 2011 at 11:33 AM, Ravi Devani wrote:
> scott i hv given the link to the gbk file, please kindly help me
>
> On 8/19/11, Scott Cain wrote:
>> Ravi,
>>
>> I also meant to ask what version of BioPerl you are using. When I run
>> this command
>>
>>   bp_genbank2gff3.pl NW_002121371.gbk
>>
>> I get a rather dull GFF3 file with 4 lines of GFF (one region and
>> three gaps) and a fasta section.
>>
>> Scott
>>
>>
>> On Fri, Aug 19, 2011 at 12:33 PM, Ravi Devani wrote:
>>> No the genbank file has not been manipulated
>>>
>>> On 8/19/11, Scott Cain wrote:
>>>> I was wondering if perhaps the genbank file had been manipulated in some way.
>>>>
>>>> Scott
>>>>
>>>>
>>>> On Fri, Aug 19, 2011 at 10:26 AM, Chris Fields wrote:
>>>>> Scott,
>>>>>
>>>>> http://www.ncbi.nlm.nih.gov/nuccore/NW_002121371.1?report=gbwithparts&log$=seqview
>>>>>
>>>>> (it's in the GFF file)
>>>>>
>>>>> It definitely is getting stuck in a loop for the genomic region, but using
>>>>> the file for GFF3 doesn't make sense (very few features of note).
>>>>>
>>>>> On Aug 19, 2011, at 8:45 AM, Scott Cain wrote:
>>>>>
>>>>>> Ravi,
>>>>>>
>>>>>> The gff file is fairly useless from a debugging perspective. Can you
>>>>>> please attach the genbank file you're using? Also, please indicate what
>>>>>> version of bioperl you're using.
>>>>>>
>>>>>> Scott
>>>>>>
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On Aug 19, 2011, at 1:16 AM, Ravi Devani wrote:
>>>>>>
>>>>>>> ---------- Forwarded message ----------
>>>>>>> From: Ravi Devani
>>>>>>> Date: Thu, Aug 18, 2011 at 12:40 PM
>>>>>>> Subject: pls help..
>>>>>>> To: scott at scottcain.net
>>>>>>>
>>>>>>>
>>>>>>> i tried to create a gff3 file from .gbk file using bp_genbank2gff3.pl
>>>>>>> but
>>>>>>> what i get is same features repeating many times.. and the file keeps
>>>>>>> growing in size until my harddisk gets full.. i have tried to filter
>>>>>>> all
>>>>>>> other features except "region" but still it repeats a single entry
>>>>>>> many
>>>>>>> times.. i have attached a part of the file generated.. pls kindly
>>>>>>> help
>>>>>>> me.
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioperl-l mailing list
>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> ------------------------------------------------------------------------
>>>> Scott Cain, Ph. D.                                   scott at scottcain
>>>> dot
>>>> net
>>>> GMOD Coordinator (http://gmod.org/)                     216-392-3087
>>>> Ontario Institute for Cancer Research
>>>>
>>>
>>
>>
>>
>> --
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                                   scott at scottcain dot
>> net
>> GMOD Coordinator (http://gmod.org/)                     216-392-3087
>> Ontario Institute for Cancer Research
>>
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)
216-392-3087
Ontario Institute for Cancer Research

From allenday at ionflux.com Mon Aug 22 18:40:33 2011
From: allenday at ionflux.com (Allen Day, PhD)
Date: Mon, 22 Aug 2011 18:40:33 +0000
Subject: [Bioperl-l] Beijing and Los Angeles Human NGS Biostatistics/Informatics jobs
Message-ID:

Hi all,

Ion Flux is a startup that I just created to apply NGS technology to the clinical diagnostics field. We like to think of ourselves as an enterprise class "23andme". This is an early-stage startup -- you will have a chance to influence the company and to be rewarded accordingly. I am Allen, the founder.

We have a couple of open positions - for smart, passionate, scientist / engineering types. Others need not apply. Please check out these job descriptions, if this sparks your interest:

http://ionflux.com/blog/careers/bioinformatician-data-modeling-and-processing/
http://ionflux.com/blog/careers/bioinformatician-data-analysis-and-algorithms/

Our offices are in Los Angeles (UCLA adjacent) and Beijing. I'm happy to post future openings to other lists if this isn't the right venue for an occasional job announcement.

-Allen

From acpatel at gmail.com Mon Aug 22 19:25:50 2011
From: acpatel at gmail.com (Anand Patel)
Date: Mon, 22 Aug 2011 14:25:50 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
Message-ID:

I'm trying to get Primer3Redux to work, and am noticing some strange things. While I found and changed my parameters to the new primer3 2.2.3 parameters, I still can't find add_targets.

Assigning the parameters using set_parameters works for primer3redux; add_targets is "leftover" from primer3.

So is this a doc/POD issue?

Thanks,
Anand

Anand C.
Patel, MD MS
Washington University School of Medicine
acpatel at gmail.com

From cjfields1 at gmail.com Mon Aug 22 19:42:28 2011
From: cjfields1 at gmail.com (Christopher Fields)
Date: Mon, 22 Aug 2011 14:42:28 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To:
References:
Message-ID:

On Aug 22, 2011, at 2:25 PM, Anand Patel wrote:

> I'm trying to get Primer3Redux to work, and am noticing some strange
> things. While I found and changed my parameters to the new primer3
> 2.2.3 parameters, I still can't find add_targets.
>
> Assigning the parameters using set_parameters works for primer3redux,
> add_targets is "leftover" from primer3.
>
> So is this a doc/POD issue?

I'm confused. You are trying to use add_targets with the latest primer3, but it isn't there? Or is the Primer3Redux wrapper missing this parameter?

chris

> Thanks,
> Anand
>
> Anand C. Patel, MD MS
> Washington University School of Medicine
> acpatel at gmail.com
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From acpatel at gmail.com Mon Aug 22 19:52:12 2011
From: acpatel at gmail.com (Anand Patel)
Date: Mon, 22 Aug 2011 14:52:12 -0500
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To:
References:
Message-ID:

my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => "temp.out",
    -path => "/usr/bin/primer3_core");

If I use this:

$primer3->add_targets(
    'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
    'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
    'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
    'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
    'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
    'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
    'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
    'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
    'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE'
    =>$PRIMER_PRODUCT_SIZE_RANGE);

I get:

Can't locate object
method "add_targets" via package "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1.

On the other hand, if I change that line to:

$primer3->set_parameters(
    'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
    'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
    'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
    'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
    'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
    'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
    'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
    'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
    'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE'
    =>$PRIMER_PRODUCT_SIZE_RANGE);

It works. When I looked at the source code for Primer3Redux, I couldn't find add_targets, but set_parameters looked like it might work, so I used that instead, and it worked.

But I see over in the github that there are other issues with the documentation (how primer3redux's result object is now 3 deep rather than 2 deep). Not sure if this is in that category or not.

Thanks,
Anand

On Mon, Aug 22, 2011 at 2:42 PM, Christopher Fields wrote:
> On Aug 22, 2011, at 2:25 PM, Anand Patel wrote:
>
>> I'm trying to get Primer3Redux to work, and am noticing some strange
>> things. While I found and changed my parameters to the new primer3
>> 2.2.3 parameters, I still can't find add_targets.
>>
>> Assigning the parameters using set_parameters works for primer3redux,
>> add_targets is "leftover" from primer3.
>>
>> So is this a doc/POD issue?
>
> I'm confused. You are trying to use add_targets with the latest primer3, but it isn't there? Or is the Primer3Redux wrapper missing this parameter?
>
> chris
>
>> Thanks,
>> Anand
>>
>> Anand C.
Patel, MD MS >> Washington University School of Medicine >> acpatel at gmail.com >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Mon Aug 22 20:10:25 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 22 Aug 2011 15:10:25 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: References: Message-ID: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote: > my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => > "temp.out", -path => "/usr/bin/primer3_core"); > > If I use this: > $primer3->add_targets( > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > =>$PRIMER_PRODUCT_SIZE_RANGE); > > I get: > Can't locate object method "add_targets" via package > "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. > > On the other hand, if I change that line to: > $primer3->set_parameters( > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' > =>$PRIMER_PRODUCT_SIZE_RANGE); > > It works. 
When I looked at the source code for Primer3Redux, I
> couldn't find add_targets, but set_parameters looked like it might
> work, so I used that instead, and it worked.
>
> But I see over in the github that there are other issues with the
> documentation (how primer3redux's result object is now 3 deep rather
> than 2 deep). Not sure if this is in that category or not.

That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github; if you can file this as well it would help.

I have to go back and look at the history of add_targets() relative to the primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience...

chris

> Thanks,
> Anand
...

From miquel.amat at me.com Tue Aug 23 20:11:15 2011
From: miquel.amat at me.com (Miguel A. Amat)
Date: Tue, 23 Aug 2011 16:11:15 -0400
Subject: [Bioperl-l] Installation on OS X Lion
Message-ID:

I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql.

I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively via the "perl Build.PL" command.

Can you provide some help, or suggest an alternative way of installing BioPerl?

From cjfields at illinois.edu Wed Aug 24 00:14:49 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 23 Aug 2011 19:14:49 -0500
Subject: [Bioperl-l] Installation on OS X Lion
In-Reply-To:
References:
Message-ID:

Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req.
DBD::mysql is only needed if you intend on using modules requiring that functionality. chris On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: > I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. > > I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. > > Can you provide some help, or suggest an alternative way of installing BioPerl? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From miquel.amat at me.com Wed Aug 24 03:25:31 2011 From: miquel.amat at me.com (Miguel A Amat) Date: Tue, 23 Aug 2011 23:25:31 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: References: Message-ID: Thanks for the feedback, Chris. Now I just need to get GD to install ... On Aug 23, 2011, at 8:14 PM, Chris Fields wrote: > Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend on using modules requiring that functionality. > > chris > > On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: > >> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >> >> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >> >> Can you provide some help, or suggest an alternative way of installing BioPerl? 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From scott at scottcain.net Wed Aug 24 14:31:44 2011 From: scott at scottcain.net (Scott Cain) Date: Wed, 24 Aug 2011 10:31:44 -0400 Subject: [Bioperl-l] Installation on OS X Lion In-Reply-To: References: Message-ID: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> Hi Miguel, Did you try the installer for snow leopard on sourceforge: http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/ I don't know if it will work on lion but I don't have a copy of lion yet to try it out on. Scott Sent from my iPhone On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote: > Thanks for the feedback, Chris. Now I just need to get GD to > install ... > > On Aug 23, 2011, at 8:14 PM, Chris Fields > wrote: > >> Try installing the latest version from CPAN; this bypasses the >> Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend >> on using modules requiring that functionality. >> >> chris >> >> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >> >>> I am trying to install bioperl on mac os x 10.7 but ran into >>> problems with the dependency packages Bio::ASN1::EntrezGene and >>> DBD::mysql. >>> >>> I am running the latest version of CPAN and perl -v 5.12.3 and the >>> BioPerl-1.6.1 package. The installation was being conducted >>> interactively through via the "perl Build.PL" command. >>> >>> Can you provide some help, or suggest an alternative way of >>> installing BioPerl? 
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From sheena.scroggins at gmail.com Wed Aug 24 16:21:07 2011
From: sheena.scroggins at gmail.com (Sheena Scroggins)
Date: Wed, 24 Aug 2011 09:21:07 -0700
Subject: [Bioperl-l] End of GSoC
Message-ID:

I just wanted to give a GIANT thanks to my mentors on the BioPerl project, Rob Buels and Chris Fields. They helped me tremendously and we made great progress on the reorganization.

All of the modules we extracted can be found on github at https://github.com/bioperl

We used a Dist::Zilla plugin bundle, which can also be found there. The steps used in the process will be outlined on the BioPerl wiki in the upcoming weeks. The reorganization is off to a great start and by outlining the workflow I'm hoping others will be able to contribute more easily.

The progress updates were posted at techomics.com during the project, although they were sporadic. The original outline of the project can be found there as well.

Thanks again to all the mentors of GSoC, this program wouldn't work without you!

Sheena

From miquel.amat at me.com Wed Aug 24 17:48:06 2011
From: miquel.amat at me.com (Miguel A. Amat)
Date: Wed, 24 Aug 2011 13:48:06 -0400
Subject: [Bioperl-l] Installation on OS X Lion
In-Reply-To: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net>
References: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net>
Message-ID: <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com>

Thanks for all the help; I finally got it to work. Here are the steps I took:

- upgraded CPAN and used latest version of BioPerl
- installed dependencies in interactive mode, but GD failed.
- Quit the installation and tried "install GD-SVG"; this one seems to have less functionality than GD, but it worked.
- Installed Bio::Perl.
- Then, installed Bio::ASN1::EntrezGene

Best.

On Aug 24, 2011, at 10:31 AM, Scott Cain wrote:

> Hi Miguel,
>
> Did you try the installer for snow leopard on sourceforge:
>
> http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/
>
> I don't know if it will work on lion but I don't have a copy of lion yet to try it out on.
>
> Scott
>
>
> Sent from my iPhone
>
> On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote:
>
>> Thanks for the feedback, Chris. Now I just need to get GD to install ...
>>
>> On Aug 23, 2011, at 8:14 PM, Chris Fields wrote:
>>
>>> Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req. DBD::mysql is only needed if you intend on using modules requiring that functionality.
>>>
>>> chris
>>>
>>> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote:
>>>
>>>> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql.
>>>>
>>>> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command.
>>>>
>>>> Can you provide some help, or suggest an alternative way of installing BioPerl?
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at illinois.edu Wed Aug 24 17:51:19 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 24 Aug 2011 12:51:19 -0500
Subject: [Bioperl-l] Installation on OS X Lion
In-Reply-To: <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com>
References: <0D4184A9-2166-4869-823A-BC780E684DCE@scottcain.net> <251484F2-E0EB-454D-B664-BB0834FFCF76@me.com>
Message-ID: <200F67E8-7B4E-40AD-9C0D-37160B970F22@illinois.edu>

Interesting, since GD::SVG requires GD. Anyway, glad to know it's working for you!

chris

On Aug 24, 2011, at 12:48 PM, Miguel A. Amat wrote:

> Thanks for all the help; I finally got it to work. Here are the steps I took:
>
> - upgraded CPAN and used latest version of BioPerl
> - installed dependencies in interactive mode, but GD failed.
> - Quit the installation and tried "install GD-SVG"; this one seems to have less functionality than GD, but it worked.
> - Installed Bio::Perl.
> - Then, installed Bio::ASN1::EntrezGene
>
> Best.
>
>
> On Aug 24, 2011, at 10:31 AM, Scott Cain wrote:
>
>> Hi Miguel,
>>
>> Did you try the installer for snow leopard on sourceforge:
>>
>> http://sourceforge.net/projects/gmod/files/Generic%20Genome%20Browser/libgd-MacOSX/
>>
>> I don't know if it will work on lion but I don't have a copy of lion yet to try it out on.
>>
>> Scott
>>
>>
>> Sent from my iPhone
>>
>> On Aug 23, 2011, at 11:25 PM, Miguel A Amat wrote:
>>
>>> Thanks for the feedback, Chris. Now I just need to get GD to install ...
>>>
>>> On Aug 23, 2011, at 8:14 PM, Chris Fields wrote:
>>>
>>>> Try installing the latest version from CPAN; this bypasses the Bio::ASN1::EntrezGene req.
DBD::mysql is only needed if you intend on using modules requiring that functionality. >>>> >>>> chris >>>> >>>> On Aug 23, 2011, at 3:11 PM, Miguel A. Amat wrote: >>>> >>>>> I am trying to install bioperl on mac os x 10.7 but ran into problems with the dependency packages Bio::ASN1::EntrezGene and DBD::mysql. >>>>> >>>>> I am running the latest version of CPAN and perl -v 5.12.3 and the BioPerl-1.6.1 package. The installation was being conducted interactively through via the "perl Build.PL" command. >>>>> >>>>> Can you provide some help, or suggest an alternative way of installing BioPerl? >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From abualiga2 at gmail.com Wed Aug 24 18:09:10 2011 From: abualiga2 at gmail.com (galeb abu-ali) Date: Wed, 24 Aug 2011 14:09:10 -0400 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: Hi, I'm trying to run a program that generates a circular genome homology atlas "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think the problem is with the module that appends schemas to the proxy, and I don't know how to do that manually. I've emailed the author couple times and have not heard back. Pasted below is the error message. At your convenience, I'd greatly appreciate your help. thanks galeb p/s - also, is there another program that can generate concetric circular plots of BLAST scores for multiple bacterial genomes with a per nucleotide resolution? 
thanks [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position Preference" -title "B. pseudomallei K96243" > sgeneric.ps # title set to 'B. pseudomallei K96243' # output format is ps # modus is 'circle' # loading reference genome ... # loading proteins ... # parsing blast lane configuration (blast.cfg) ... # .. parsing blast lane (B. ubonensis Bu) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done # .. parsing blast lane (B. pseudomallei DM98) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done # parsing custom lane configuration (custom.cfg) ... # .. parsing custom data entry SIDD at -0.035 ... # .. .. parsing color 000010_101010 # .. .. .. color from: r:00, g:00, b:10 # .. .. .. color to: r:10, g:10, b:10 # .. .. byrange: 9 .. 10 # .. .. boxfilter 5000 ... # .. parsing data source 'gunzip -c BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz | cut -f4 |' ... # .. .. parsing data source ... 3173005 done # reading external files and build hash of sequences ... 
*panic: schemas() removed in v2.00, not needed anymore* at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at xml-compile.pl line 48 main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at BLASTatlas line 177 From roy.chaudhuri at gmail.com Wed Aug 24 18:21:12 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 24 Aug 2011 19:21:12 +0100 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: <4E554118.90108@gmail.com> Hi Galeb, This is the wrong mailing list for your question - it's intended for discussion of the Bioperl toolkit, not general bioinformatics questions. Next time, try a general bioinformatics mailing list such as BBB: http://www.bioinformatics.org/lists/bbb Having said all that, maybe you could try BRIG: http://sourceforge.net/projects/brig/ http://www.biomedcentral.com/1471-2164/12/402 Cheers, Roy. On 24/08/2011 19:09, galeb abu-ali wrote: > Hi, > > I'm trying to run a program that generates a circular genome homology atlas > "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think > the problem is with the module that appends schemas to the proxy, and I > don't know how to do that manually. I've emailed the author couple times and > have not heard back. Pasted below is the error message. At your convenience, > I'd greatly appreciate your help. > > thanks > > galeb > > p/s - also, is there another program that can generate concetric circular > plots of BLAST scores for multiple bacterial genomes with a per nucleotide > resolution? 
thanks > > [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa > -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg > -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position > Preference" -title "B. pseudomallei K96243"> sgeneric.ps > # title set to 'B. pseudomallei K96243' > # output format is ps > # modus is 'circle' > # loading reference genome ... > # loading proteins ... > # parsing blast lane configuration (blast.cfg) ... > # .. parsing blast lane (B. ubonensis Bu) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done > # .. parsing blast lane (B. pseudomallei DM98) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done > # parsing custom lane configuration (custom.cfg) ... > # .. parsing custom data entry SIDD at -0.035 ... > # .. .. parsing color 000010_101010 > # .. .. .. color from: r:00, g:00, b:10 > # .. .. .. color to: r:10, g:10, b:10 > # .. .. byrange: 9 .. 10 > # .. .. boxfilter 5000 ... > # .. parsing data source 'gunzip -c > BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz > | cut -f4 |' ... > # .. .. parsing data source ... 3173005 done > # reading external files and build hash of sequences ... 
> *panic: schemas() removed in v2.00, not needed anymore* > at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 > XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at > xml-compile.pl line 48 > main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " > http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " > http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at > BLASTatlas line 177 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Aug 24 18:22:26 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Aug 2011 13:22:26 -0500 Subject: [Bioperl-l] append schema to proxy In-Reply-To: References: Message-ID: <61E5BF3C-653F-40D3-8764-0DA61859BC8B@illinois.edu> Sorry, but this doesn't have anything to do with BioPerl. Not sure you'll get an answer here. chris On Aug 24, 2011, at 1:09 PM, galeb abu-ali wrote: > Hi, > > I'm trying to run a program that generates a circular genome homology atlas > "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think > the problem is with the module that appends schemas to the proxy, and I > don't know how to do that manually. I've emailed the author couple times and > have not heard back. Pasted below is the error message. At your convenience, > I'd greatly appreciate your help. > > thanks > > galeb > > p/s - also, is there another program that can generate concetric circular > plots of BLAST scores for multiple bacterial genomes with a per nucleotide > resolution? thanks > > [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa > -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg > -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position > Preference" -title "B. pseudomallei K96243" > sgeneric.ps > # title set to 'B. 
pseudomallei K96243' > # output format is ps > # modus is 'circle' > # loading reference genome ... > # loading proteins ... > # parsing blast lane configuration (blast.cfg) ... > # .. parsing blast lane (B. ubonensis Bu) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done > # .. parsing blast lane (B. pseudomallei DM98) ... > # .. .. program: tblastn > # .. .. parsing color 101010_040410 > # .. .. .. color from: r:10, g:10, b:10 > # .. .. .. color to: r:04, g:04, b:10 > # .. .. byrange: 0 .. 0.8 > # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done > # parsing custom lane configuration (custom.cfg) ... > # .. parsing custom data entry SIDD at -0.035 ... > # .. .. parsing color 000010_101010 > # .. .. .. color from: r:00, g:00, b:10 > # .. .. .. color to: r:10, g:10, b:10 > # .. .. byrange: 9 .. 10 > # .. .. boxfilter 5000 ... > # .. parsing data source 'gunzip -c > BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz > | cut -f4 |' ... > # .. .. parsing data source ... 3173005 done > # reading external files and build hash of sequences ... 
> *panic: schemas() removed in v2.00, not needed anymore* > at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 > XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at > xml-compile.pl line 48 > main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " > http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " > http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at > BLASTatlas line 177 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From abualiga2 at gmail.com Wed Aug 24 18:39:33 2011 From: abualiga2 at gmail.com (abualiga2 at gmail.com) Date: Wed, 24 Aug 2011 18:39:33 +0000 Subject: [Bioperl-l] append schema to proxy In-Reply-To: <4E554118.90108@gmail.com> Message-ID: <00504502ec3723598f04ab44a23f@google.com> Roy, thanks! I'll try that. galeb On Aug 24, 2011 2:21pm, Roy Chaudhuri wrote: > Hi Galeb, > This is the wrong mailing list for your question - it's intended for > discussion of the Bioperl toolkit, not general bioinformatics questions. > Next time, try a general bioinformatics mailing list such as BBB: > http://www.bioinformatics.org/lists/bbb > Having said all that, maybe you could try BRIG: > http://sourceforge.net/projects/brig/ > http://www.biomedcentral.com/1471-2164/12/402 > Cheers, > Roy. From slucky at ibab.ac.in Mon Aug 22 06:01:16 2011 From: slucky at ibab.ac.in (Lucky Singh) Date: Mon, 22 Aug 2011 11:31:16 +0530 (IST) Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast Message-ID: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Dear sir/Ma'am, I am student of Institute of Bioinformatics and Applied Biotechnology, Bangalore, India. While doing my project work I needed remoteblast.pm. So I used default example program which is available with this package. 
Now I wanted to host it from web server, but This program is not working from it may be it is not able to create or write on file from web server but in command line it is working fine. I don't know the possible reason, please help me to figure it out. -> I am using same example program with basic cgi modification for taking input from web browser. -> Ubuntu 10.04 64 bit OS -> apache2 server -> I have given all permissions 777 recursively to cgi-bin folder -- Regards, Lucky Singh Institute of Bioinformatics and Applied Biotechnology, ------------------------------------------------------ Biotech Park Electronics City Phase I Bangalore 560 100 India. Tel: 080-28528900, 080-28528901, 080-28528902 Fax: 080-28528904 From abualiga2 at gmail.com Wed Aug 24 17:26:10 2011 From: abualiga2 at gmail.com (galeb abu-ali) Date: Wed, 24 Aug 2011 13:26:10 -0400 Subject: [Bioperl-l] append schema to proxy Message-ID: Hi, I'm trying to run a program that generates a circular genome homology atlas "BLASTatlas" ( http://www.cbs.dtu.dk/ws/ws.php?entry=BLASTatlas ). I think the problem is with the module that appends schemas to the proxy, and I don't know how to do that manually. I've emailed the author couple times and have not heard back. Pasted below is the error message. At your convenience, I'd greatly appreciate your help. thanks galeb p/s - also, is there another program that can generate concetric circular plots of BLAST scores for multiple bacterial genomes with a per nucleotide resolution? thanks [galeb at localhost GeneWiz]$ BLASTatlas -modus circle -ref BX571966.fsa -proteins BX571966.proteins.fsa -ann BX571966.ann -blastcfg blast.cfg -customcfg custom.cfg --dnap="Intrinsic Curvature,Stacking Energy,Position Preference" -title "B. pseudomallei K96243" > sgeneric.ps # title set to 'B. pseudomallei K96243' # output format is ps # modus is 'circle' # loading reference genome ... # loading proteins ... # parsing blast lane configuration (blast.cfg) ... # .. 
parsing blast lane (B. ubonensis Bu) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19539.fsa |' ... 1142 done # .. parsing blast lane (B. pseudomallei DM98) ... # .. .. program: tblastn # .. .. parsing color 101010_040410 # .. .. .. color from: r:10, g:10, b:10 # .. .. .. color to: r:04, g:04, b:10 # .. .. byrange: 0 .. 0.8 # .. parsing sequene source 'cat ./19509.fsa |' ... 2370 done # parsing custom lane configuration (custom.cfg) ... # .. parsing custom data entry SIDD at -0.035 ... # .. .. parsing color 000010_101010 # .. .. .. color from: r:00, g:00, b:10 # .. .. .. color to: r:10, g:10, b:10 # .. .. byrange: 9 .. 10 # .. .. boxfilter 5000 ... # .. parsing data source 'gunzip -c BX571966-57a2f2c2e11ca0dd8cd74493d667d4d6-3173005.sidd--0.035-c-10-c.out.gz | cut -f4 |' ... # .. .. parsing data source ... 3173005 done # reading external files and build hash of sequences ... *panic: schemas() removed in v2.00, not needed anymore* at /usr/local/lib/perl5/site_perl/5.12.2/XML/Compile/WSDL11.pm line 65 XML::Compile::WSDL11::schemas(XML::Compile::WSDL11=HASH(0x1fed6740)) at xml-compile.pl line 48 main::appendSchemas(XML::Compile::WSDL11=HASH(0x1fed6740), " http://www.cbs.dtu.dk/ws/common/ws_common_1_0b.xsd", " http://www.cbs.dtu.dk/ws/BLASTatlas/ws_blastatlas_1_0_ws2.xsd") at BLASTatlas line 177 From jj.emerson at gmail.com Thu Aug 25 01:53:38 2011 From: jj.emerson at gmail.com (J.J. Emerson) Date: Wed, 24 Aug 2011 18:53:38 -0700 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe Message-ID: Hello All, I have experienced some behavior in SeqIO that doesn't seem to be what I would expect. Basically, for a certain script, if I try to pass something like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the following two conditions are met simultaneously: 1. 
STDIN is coming from a pipe;
2. SeqIO is trying to guess the format.

If STDIN is coming from redirection instead of a pipe, or if the format is specified manually (i.e. BioPerl doesn't have to guess), the error doesn't seem to occur.

This issue has been reported previously:

http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html
https://redmine.open-bio.org/issues/3122

This issue is ultimately one of using seek() on a pipe, which pipes do not support (see below). To be clear, there are kludgy ways around this that allow BioPerl to take input from a pipe AND guess the format. My naive and inefficient kludge was to test for reading from STDIN and for the absence of a format. If both of these conditions are met, I slurp STDIN into a variable, open a filehandle on that variable, and pass it to SeqIO, which can guess the format if the fh isn't opened on a pipe. SeqIO then successfully guesses the format and does the SeqIO thing, at the expense of having the program pass over the data at least twice. And if the input file is huge, it could potentially consume all the memory. A better way to address the problem would be to process the input one line at a time, but this seems to require more extensive changes.

The reason I'm reposting this is that I think the inability to guess the sequence format of data originating from a pipe is an important limitation for a fundamental part of BioPerl. When designing scripts to be used in pipelines, the inability to guess formats for piped data limits BioPerl's pipelineability substantially. Even though previous reports of this have been made and a bug opened and closed, I was wondering if anyone thought this was worth fixing so as to make SeqIO (and probably AlignIO as well?) more flexible.

Does anyone think this should be refiled as a bug?

Cheers,

J.J.

PS

Below are snippets of code and/or errors related to reproducing the failure to guess unspecified formats.
I'll see how Mailman treats my attachments and post the code as a reply if they don't work.

The bioperl_fhtest.pl attachment is the script that reproduces the error. The w.fa attachment is a fasta file containing some sequences (in my case, the w gene from different *Drosophila* species).

Here are the command lines that generate the behavior I observe:

    ./bioperl_fhtest.pl fasta < w.fa        # Works (redirection, no guessing)
    ./bioperl_fhtest.pl < w.fa              # Works (redirection, guessing)
    cat w.fa | ./bioperl_fhtest.pl fasta    # Works (pipe, no guessing)
    cat w.fa | ./bioperl_fhtest.pl          # DOESN'T work (pipe, guessing)

Here's the error I get in the last case:

    ------------- EXCEPTION: Bio::Root::Exception -------------
    MSG: Failed resetting the filehandle; IO error occurred
    STACK: Error::throw
    STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472
    STACK: Bio::Tools::GuessSeqFormat::guess /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512
    STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381
    STACK: ./bioperl_fhtest.pl:8
    -----------------------------------------------------------

From what I gather, the error is triggered by a failure of seek() on the STDIN filehandle on lines 517-518 (text from the version of GuessSeqFormat.pm installed on my server):

    512     if (defined $self->{-file}) {
    513         # Close the file we opened.
    514         close($fh);
    515     } elsif (ref $fh eq 'GLOB') {
    516         # Try seeking to the start position.
    517         seek($fh, $start_pos, 0) || $self->throw("Failed resetting the ".
    518                                                  "filehandle; IO error occurred");;
    519     } elsif (defined $fh && $fh->can('setpos')) {
    520         # Seek to the start position.
    521         $fh->setpos($start_pos);
    522     }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bioperl_fhtest.pl
Type: text/x-perl-script
Size: 505 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: w.fa
Type: application/octet-stream
Size: 6335 bytes
Desc: not available
URL:

From frederic.sapet at gmail.com Thu Aug 25 13:24:08 2011
From: frederic.sapet at gmail.com (Frédéric Sapet)
Date: Thu, 25 Aug 2011 15:24:08 +0200
Subject: [Bioperl-l] fasta35 and fasta36 parsing support in BioPerl
Message-ID:

Hello

I have tried to parse a fasta35 report file using BioPerl, in order to produce a valid HTML file. It seems to work well, but there's a small issue with the homology string in the report. Please find a test script in the attached files.

After that, I tried to parse a fasta36 file, but this does not seem to be supported yet. Here is the error thrown:

Uncaught exception from user code:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unrecognized alignment line (3) '>--'
STACK: Error::throw
STACK: Bio::Root::Root::throw /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm:472
STACK: Bio::SearchIO::fasta::next_result /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm:1061
STACK: ./test.pl:36
-----------------------------------------------------------
 at /usr/lib/perl5/site_perl/5.10.0/Error.pm line 184
    Error::throw('Bio::Root::Exception', 'Unrecognized alignment line (3) \'>--\'') called at /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm line 472
    Bio::Root::Root::throw('Bio::SearchIO::fasta=HASH', 'Unrecognized alignment line (3) \'>--\'') called at /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm line 1061
    Bio::SearchIO::fasta::next_result('Bio::SearchIO::fasta=HASH') called at ./test.pl line 36

Thank you

Fred
-------------- next part --------------
A non-text attachment was scrubbed...
Name: FastaBioPerl.tar.bz2
Type: application/x-bzip2
Size: 7692 bytes
Desc: not available
URL:

From miquel.amat at me.com Tue Aug 23 06:07:54 2011
From: miquel.amat at me.com (Miguel A. Amat)
Date: Tue, 23 Aug 2011 02:07:54 -0400
Subject: [Bioperl-l] Help
Message-ID: <44829080-5467-4103-AF5B-D09CBDA6F99F@me.com>

I am trying to install BioPerl on Mac OS X 10.7 but ran into problems with the dependencies Bio::ASN1::EntrezGene and DBD::mysql. I am running the latest version of CPAN, perl 5.12.3, and the BioPerl-1.6.1 package. The installation was being conducted interactively via the "perl Build.PL" command. Can you provide some help?

From bosborne11 at verizon.net Thu Aug 25 14:35:29 2011
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 25 Aug 2011 10:35:29 -0400
Subject: [Bioperl-l] SeqIO alters Genbank files
Message-ID:

bioperl-l,

I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file:

    /score=100.1

And adding a "note" tag, so the output file contains this:

    /score=100.1
    /note="score=100.1"

I'm assuming that the code does this because NCBI will not accept score tags and values, even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format.

On the other hand, I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this:

    /score=100.1
    /note="score=100.1"
    /note="score=100.1"
    /note="score=100.1"
    /note="score=100.1"

Should I comment out the code that's doing these edits or not?

Thanks again,

Brian O.
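[Editorial sketch] The accumulation described above can be modelled in a few lines of plain Perl. This is not the FTHelper.pm code: the tag table, add_score_note(), and round_trips() are hypothetical names invented for illustration, and each call simply stands in for one read/write round trip. The "guard" flag shows one possible alternative to commenting the code out, namely adding the note only when an identical one is not already present.

```perl
use strict;
use warnings;

# Hypothetical model of a feature's tag table: tag name => list of values.
# NOT the FTHelper.pm code; it only mimics the behaviour described above.
sub add_score_note {
    my ($tags, $guard) = @_;
    return unless exists $tags->{score};
    my $note = "score=$tags->{score}[0]";

    # With the guard on, skip the note if an identical one already exists.
    return if $guard && grep { $_ eq $note } @{ $tags->{note} || [] };

    push @{ $tags->{note} }, $note;
}

# Simulate $n read/write round trips and count the resulting note tags.
sub round_trips {
    my ($n, $guard) = @_;
    my %tags = (score => ['100.1']);
    add_score_note(\%tags, $guard) for 1 .. $n;
    return scalar @{ $tags{note} };
}

printf "unguarded: %d note tag(s)\n", round_trips(4, 0);
printf "guarded:   %d note tag(s)\n", round_trips(4, 1);
```

Whether a guard like this or outright removal is preferable depends on whether anything downstream relies on the extra note, which this sketch cannot answer.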
From cjfields at illinois.edu Thu Aug 25 16:21:15 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:21:15 -0500 Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast In-Reply-To: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> References: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Message-ID: It's hard to evaluate what the problem is w/o code, the BioPerl version, and so on. It's very possible you are using an out-of-date BioPerl. chris On Aug 22, 2011, at 1:01 AM, Lucky Singh wrote: > Dear sir/Ma'am, > > I am student of Institute of Bioinformatics and Applied Biotechnology, > Bangalore, India. While doing my project work I needed remoteblast.pm. So > I used default example program which is available with this package. Now I > wanted to host it from web server, but This program is not working from it > may be it is not able to create or write on file from web server but in > command line it is working fine. I don't know the possible reason, please > help me to figure it out. > > > -> I am using same example program with basic cgi modification for taking > input from web browser. > -> Ubuntu 10.04 64 bit OS > -> apache2 server > -> I have given all permissions 777 recursively to cgi-bin folder > > > -- > Regards, > Lucky Singh > > Institute of Bioinformatics and Applied Biotechnology, > ------------------------------------------------------ > Biotech Park > Electronics City Phase I > Bangalore 560 100 > India. 
> Tel: 080-28528900, 080-28528901, 080-28528902
> Fax: 080-28528904

From cjfields at illinois.edu Thu Aug 25 16:34:40 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 25 Aug 2011 11:34:40 -0500
Subject: [Bioperl-l] fasta35 and fasta36 parsing support in BioPerl
In-Reply-To:
References:
Message-ID: <4C95797A-343C-4651-AF0C-964A7E10E8D1@illinois.edu>

Frederic,

The best place to post this is to our bug server:

http://redmine.open-bio.org

Attach all relevant data for the bug; this really helps us diagnose the issue.

chris

On Aug 25, 2011, at 8:24 AM, Frédéric Sapet wrote:

> Hello
> I have tried to parse a fasta35 report file using BioPerl, in order to
> produce a valid HTML file.
> It seems to work well, but there's a small issue with homology string
> in the report.
> Please find in joined files, a test script.
> > After that, I have tried to parse a fasta36 file, but this seems to be > not supported yet: here is the error thrown : > > Uncaught exception from user code: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Unrecognized alignment line (3) '>--' > STACK: Error::throw > STACK: Bio::Root::Root::throw > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm:472 > STACK: Bio::SearchIO::fasta::next_result > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm:1061 > STACK: ./test.pl:36 > ----------------------------------------------------------- > at /usr/lib/perl5/site_perl/5.10.0/Error.pm line 184 > Error::throw('Bio::Root::Exception', 'Unrecognized alignment line (3) > \'>--\'') called at > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/Root/Root.pm line > 472 > Bio::Root::Root::throw('Bio::SearchIO::fasta=HASH', 'Unrecognized > alignment line (3) \'>--\'') called at > /home/bga/bioinfo/fsapet/BioPerlLive/lib/perl5/Bio/SearchIO/fasta.pm > line 1061 > Bio::SearchIO::fasta::next_result('Bio::SearchIO::fasta=HASH') called > at ./test.pl line 36 > > Thank you > > Fred > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 16:42:30 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:42:30 -0500 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: Brian, I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. 
I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? chris On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: > bioperl-l, > > I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: > > /score=100.1 > > And adding a "note" tag, so the output file contains this: > > /score=100.1 > /note="score=100.1" > > I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. > > On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: > > /score=100.1 > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > > Should I comment out the code that's doing these edits or not? > > Thanks again, > > Brian O. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Aug 25 16:58:51 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 11:58:51 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: Message-ID: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> On Aug 24, 2011, at 8:53 PM, J.J. Emerson wrote: > Hello All, > > I have experienced some behavior in SeqIO that doesn't seem to be what I > would expect. 
Basically, for a certain script, if I try to pass something > like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the > following two conditions are met simultaneously: > > 1. STDIN is coming from a pipe; > 2. SeqIO is trying to guess the format. > > If STDIO is coming from redirection instead of a pipe or if the format is > specified manually (i.e. BioPERL doesn't have to guess), the error doesn't > seem to occur. > > This issue has been reported previously: > > http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html > https://redmine.open-bio.org/issues/3122 Yes, this was addressed according to that case. > This issue is ultimately one of using seek() on a pipe, which is forbidden > (see below). To be clear, there are kludgy ways around this that allow > BioPERL to take input from a pipe AND guess the format. My naive and > inefficient kludge was to test for reading from STDIN and for the absence of > a format. If both of these conditions are met, then I slurp STDIN into a > variable and then open a filehandle on that variable, and pass it to SeqIO, > which can guess the format if the fh isn't opened on a pipe. SeqIO then > successfully guesses the format and does the SeqIO thing, at the expense of > having the program pass over the data at least twice. And if the input file > is huge, it could potentially consume all the memory. A better way to > address the problem would be to process the input one line at a time, but > this seems to require more extensive changes. Have you tried tempfiles? Not that this is a great solution, but it's very commonly used for large sequence data, and it is seekable. This behavior could also be wrapped in GuessSeqFormat i suppose (but see below) > The reason I'm reposting this is because I think that the inability to guess > the sequence format from data originating from a pipe is an important > limitation for a fundamental part of BioPERL. 
When designing scripts to be > used in pipelines, the inability to guess formats for piped data limits > BioPERL's pipelineability substantially. Even though previous reports of > this have been made and a bug opened and closed, I was wondering if anyone > thought this was worthwhile fixing so as to make SeqIO (and probably AlignIO > as well?) more flexible? > > Does anyone think this should be refiled as a bug? > > Cheers, > > J.J. The fundamental problem with pipes (as you indicated) is that the data stream is not seekable. We do have a built-in buffer in Bio::Root::IO that somewhat handles this, but Bio::Tools::GuessSeqFormat is (IIRC) designed to use the filehandle directly, bypassing the BioPerl IO layer completely. One solution is to redesign GuessSeqFormat to use Bio::Root::IO, have GuessSeqFormat push all data back to the buffer, then let SeqIO parse. That will require some fundamental changes for both Bio::Root::IO and Bio::SeqIO (note that one cannot pass a Bio::Root::IO instance to another Bio::Root::IO-based class for parsing at this time). The other option is (as hinted above) having GuessSeqFormat dump the data to a tempfile, seek back after guessing, and retain the filehandle for Bio::SeqIO. Not the best solutions, but either should work. My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway? If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. 
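[Editorial sketch] A minimal sketch of the tempfile approach mentioned above, written as a caller-side workaround rather than a change to GuessSeqFormat. The helper name seekable_copy() is hypothetical and not part of BioPerl; it relies only on the core File::Temp module. The Bio::SeqIO call appears only in a comment so that the block runs without BioPerl installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper, not part of BioPerl: copy a possibly non-seekable
# handle (such as STDIN fed by a pipe) into an anonymous temporary file
# and return a seekable read/write handle positioned at the start.
sub seekable_copy {
    my ($in) = @_;

    # In scalar context tempfile() returns only the filehandle.
    my $tmp = tempfile();

    # Copy line by line so the whole input is never held in memory.
    while (defined(my $line = <$in>)) {
        print {$tmp} $line or die "Cannot write to tempfile: $!";
    }

    # Rewind; this is the operation that fails on a pipe.
    seek($tmp, 0, 0) or die "Cannot rewind tempfile: $!";
    return $tmp;
}

# Intended use (assumes BioPerl is installed; left as a comment):
#
#   my $fh = -p STDIN ? seekable_copy(\*STDIN) : \*STDIN;
#   my $in = Bio::SeqIO->new(-fh => $fh);   # no -format, so SeqIO guesses
```

The data is still written once and read once more, so this trades disk I/O for memory compared with slurping STDIN into a variable; it does not address the buffering redesign discussed above.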
> PS > > Below are snippets of code and/or errors related to reproducing the failure > to guess unspecified formats. I'll see how Mailman treats my attachments and > post the code as a reply if they don't work. > > The bioperl_fhtest.pl attachment is the script that reproduces the error. > The w.fa is a fasta file containing some sequence. > > Here are the command lines to generate the behavior I observe (w.fa is a > file containing some fasta sequences, in my case it was the w gene from > different *Drosophila* species): > > ./bioperl_fhtest.pl fasta < w.fa # Works (redirection, no guessing) >> ./bioperl_fhtest.pl < w.fa # Works (redirection, guessing) >> >> cat w.fa | ./bioperl_fhtest.pl fasta # Works (pipe, no guessing) >> cat w.fa | ./bioperl_fhtest.pl # DOESN'T work (pipe, guessing) >> > > > Here's the error I get in the last case: > > ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: Failed resetting the filehandle; IO error occurred >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472 >> STACK: Bio::Tools::GuessSeqFormat::guess >> /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512 >> STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381 >> STACK: ./bioperl_fhtest.pl:8 >> ----------------------------------------------------------- >> > >> From what I gather, the error is triggered by a failure of seek() on a STDIO > fh on lines 517-518 (text from the version GuessSeqFormat.pm installed on my > server): > > 512 if (defined $self->{-file}) { >> 513 # Close the file we opened. >> 514 close($fh); >> 515 } elsif (ref $fh eq 'GLOB') { >> 516 # Try seeking to the start position. >> 517 seek($fh, $start_pos, 0) || $self->throw("Failed resetting >> the ". >> 518 "filehandle; IO error >> occurred");; >> 519 } elsif (defined $fh && $fh->can('setpos')) { >> 520 # Seek to the start position. 
>> 521 $fh->setpos($start_pos); >> 522 } >> > _______________________________________________ You are always welcome to reopen and update the bug, or file a new one. chris From cjfields at illinois.edu Thu Aug 25 17:16:03 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 12:16:03 -0500 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: <393F144A-AECE-4F7D-B418-B71D46F3C82F@illinois.edu> Brian, Yes, that's correct (comment out or remove the other stuff). Not sure what difference it will make, I'm interested to see if anything fundamental expects this behavior and breaks with tests. Using 'git blame', it appears Allen Day added this in relation to Feature-Annotation code we actually reverted a few years ago, so this should be removed anyway. I still think we should work around FTHelper altogether. Reading the code, it seems like a ton of wasted instances being generated for no apparent reason. Now going back to our bioperl archives to see if there is any need for it... chris On Aug 25, 2011, at 11:53 AM, Brian Osborne wrote: > Chris, > > OK, will do. I should add that an early version of FTHelper was doing this same edit with the "strand", "source_tag", and "frame" tags but someone has commented out the "source_tag" and "strand" lines. > > Should I comment out both "score" and "frame" code? > > BIO > > On Aug 25, 2011, at 12:42 PM, Chris Fields wrote: > >> Brian, >> >> I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). >> >> To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. 
Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? >> >> chris >> >> On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: >> >>> bioperl-l, >>> >>> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >>> >>> /score=100.1 >>> >>> And adding a "note" tag, so the output file contains this: >>> >>> /score=100.1 >>> /note="score=100.1" >>> >>> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >>> >>> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >>> >>> /score=100.1 >>> /note="score=100.1" >>> /note="score=100.1" >>> /note="score=100.1" >>> /note="score=100.1" >>> >>> Should I comment out the code that's doing these edits or not? >>> >>> Thanks again, >>> >>> Brian O. >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bosborne11 at verizon.net Thu Aug 25 16:53:08 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 25 Aug 2011 12:53:08 -0400 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: Chris, OK, will do. 
I should add that an early version of FTHelper was doing this same edit with the "strand", "source_tag", and "frame" tags but someone has commented out the "source_tag" and "strand" lines. Should I comment out both "score" and "frame" code? BIO On Aug 25, 2011, at 12:42 PM, Chris Fields wrote: > Brian, > > I think comment out the code; our baked-in validation is only half-correct anyway, and I think it's probably a good idea to veer towards separation of format validation and parsing (they're two related but different concerns). > > To tell the truth, I think we should eschew using FTHelper altogether and just use a Bio::SeqFeatureI-based class directly. I haven't quite grasped the reasoning behind FTHelper.pm, and I would bet removing it as a middleman across the board would help parsing speed. Anyone have an objection to that, or at least an explanation for generation of tons of FTHelper instances that couldn't be handled by a Factory? > > chris > > On Aug 25, 2011, at 9:35 AM, Brian Osborne wrote: > >> bioperl-l, >> >> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >> >> /score=100.1 >> >> And adding a "note" tag, so the output file contains this: >> >> /score=100.1 >> /note="score=100.1" >> >> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >> >> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >> >> /score=100.1 >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> >> Should I comment out the code that's doing these edits or not? 
>>
>> Thanks again,
>>
>> Brian O.

From jj.emerson at gmail.com Thu Aug 25 18:52:48 2011
From: jj.emerson at gmail.com (J.J. Emerson)
Date: Thu, 25 Aug 2011 11:52:48 -0700
Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe
In-Reply-To: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu>
References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu>
Message-ID:

Hi Chris,

You asked:

> My question (not a criticism, just trying to understand the problem): why
> are you going through all the trouble of using GuessSeqFormat as a permanent
> solution anyway? If you have a stream returning a possibly unknown data
> type, I would argue that the fundamental bug is not GuessSeqFormat but
> something else, more specifically not knowing the behavior of the data
> source and the returned format to begin with. Is something preventing that?

In my particular case, I'm trying not to impose a particular usage scenario on the script I'm writing, in the hope that it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree about whether it is), I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file.

> My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a
> permanent solution to your problems (it is guessing, after all). Note the
> code has had very little development over the years, and the related SeqIO
> code hasn't aged particularly well.

I see.
I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Bio::Root::IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug?

Cheers,

J.J.

PS

* The way I plan on using my script is roughly as follows:

    prog1 [some arguments] \
      | myscript.pl --informat fasta \
      | prog2 \
      | prog3 > pipeline.output

However, I'd like the "--informat" switch to be optional, mainly to increase usability for other users. For any well-considered format, the information needed to identify the format is right there in the data, so providing the format a second time is somewhat redundant. In principle, being able to do the following would be very useful:

    prog1 [some arguments] \
      | myscript.pl \
      | prog2 > pipeline.output

The modularity of pipelining is very valuable, and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe.

On Thu, Aug 25, 2011 at 9:58 AM, Chris Fields wrote:

> On Aug 24, 2011, at 8:53 PM, J.J. Emerson wrote:
>
> > Hello All,
> >
> > I have experienced some behavior in SeqIO that doesn't seem to be what I
> > would expect. Basically, for a certain script, if I try to pass something
> > like "-fh => \*STDIN" to Bio::SeqIO->new(), it will fail if both of the
> > following two conditions are met simultaneously:
> >
> > 1. STDIN is coming from a pipe;
> > 2. SeqIO is trying to guess the format.
> >
> > If STDIO is coming from redirection instead of a pipe or if the format is
> > specified manually (i.e. BioPERL doesn't have to guess), the error doesn't
> > seem to occur.
> >
> > This issue has been reported previously:
> >
> > http://lists.open-bio.org/pipermail/bioperl-l/2010-July/033681.html
> > https://redmine.open-bio.org/issues/3122
>
> Yes, this was addressed according to that case.
> > > This issue is ultimately one of using seek() on a pipe, which is > forbidden > > (see below). To be clear, there are kludgy ways around this that allow > > BioPERL to take input from a pipe AND guess the format. My naive and > > inefficient kludge was to test for reading from STDIN and for the absence > of > > a format. If both of these conditions are met, then I slurp STDIN into a > > variable and then open a filehandle on that variable, and pass it to > SeqIO, > > which can guess the format if the fh isn't opened on a pipe. SeqIO then > > successfully guesses the format and does the SeqIO thing, at the expense > of > > having the program pass over the data at least twice. And if the input > file > > is huge, it could potentially consume all the memory. A better way to > > address the problem would be to process the input one line at a time, but > > this seems to require more extensive changes. > > Have you tried tempfiles? Not that this is a great solution, but it's very > commonly used for large sequence data, and it is seekable. This behavior > could also be wrapped in GuessSeqFormat i suppose (but see below) > > > The reason I'm reposting this is because I think that the inability to > guess > > the sequence format from data originating from a pipe is an important > > limitation for a fundamental part of BioPERL. When designing scripts to > be > > used in pipelines, the inability to guess formats for piped data limits > > BioPERL's pipelineability substantially. Even though previous reports of > > this have been made and a bug opened and closed, I was wondering if > anyone > > thought this was worthwhile fixing so as to make SeqIO (and probably > AlignIO > > as well?) more flexible? > > > > Does anyone think this should be refiled as a bug? > > > > Cheers, > > > > J.J. > > The fundamental problem with pipes (as you indicated) is that the data > stream is not seekable. 
We do have a built-in buffer in Bio::Root::IO that > somewhat handles this, but Bio::Tools::GuessSeqFormat is (IIRC) designed to > use the filehandle directly, bypassing the BioPerl IO layer completely. > > One solution is to redesign GuessSeqFormat to use Bio::Root::IO, have > GuessSeqFormat push all data back to the buffer, then let SeqIO parse. That > will require some fundamental changes for both Bio::Root::IO and Bio::SeqIO > (note that one cannot pass a Bio::Root::IO instance to another > Bio::Root::IO-based class for parsing at this time). > > The other option is (as hinted above) having GuessSeqFormat dump the data > to a tempfile, seek back after guessing, and retain the filehandle for > Bio::SeqIO. Not the best solutions, but either should work. > > My question (not a criticism, just trying to understand the problem): why > are you going through all the trouble of using GuessSeqFormat as a permanent > solution anyway? If you have a stream returning a possibly unknown data > type, I would argue that the fundamental bug is not GuessSeqFormat but > something else, more specifically not knowing the behavior of the data > source and the returned format to begin with. Is something preventing that? > > My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not > a permanent solution to your problems (it is guessing, after all). Note the > code has had very little development over the years, and the related SeqIO > code hasn't aged particularly well. > > > PS > > > > Below are snippets of code and/or errors related to reproducing the > failure > > to guess unspecified formats. I'll see how Mailman treats my attachments > and > > post the code as a reply if they don't work. > > > > The bioperl_fhtest.pl attachment is the script that reproduces the > error. > > The w.fa is a fasta file containing some sequence. 
> >
> > Here are the command lines to generate the behavior I observe (w.fa is a
> > file containing some fasta sequences, in my case it was the w gene from
> > different *Drosophila* species):
> >
> >   ./bioperl_fhtest.pl fasta < w.fa        # Works (redirection, no guessing)
> >   ./bioperl_fhtest.pl < w.fa              # Works (redirection, guessing)
> >
> >   cat w.fa | ./bioperl_fhtest.pl fasta    # Works (pipe, no guessing)
> >   cat w.fa | ./bioperl_fhtest.pl          # DOESN'T work (pipe, guessing)
> >
> > Here's the error I get in the last case:
> >
> >   ------------- EXCEPTION: Bio::Root::Exception -------------
> >   MSG: Failed resetting the filehandle; IO error occurred
> >   STACK: Error::throw
> >   STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:472
> >   STACK: Bio::Tools::GuessSeqFormat::guess /usr/local/share/perl/5.10.1/Bio/Tools/GuessSeqFormat.pm:512
> >   STACK: Bio::SeqIO::new /usr/local/share/perl/5.10.1/Bio/SeqIO.pm:381
> >   STACK: ./bioperl_fhtest.pl:8
> >   -----------------------------------------------------------
> >
> > From what I gather, the error is triggered by a failure of seek() on a STDIN
> > fh on lines 517-518 (text from the version of GuessSeqFormat.pm installed on my
> > server):
> >
> >   512     if (defined $self->{-file}) {
> >   513         # Close the file we opened.
> >   514         close($fh);
> >   515     } elsif (ref $fh eq 'GLOB') {
> >   516         # Try seeking to the start position.
> >   517         seek($fh, $start_pos, 0) || $self->throw("Failed resetting the ".
> >   518                                                  "filehandle; IO error occurred");;
> >   519     } elsif (defined $fh && $fh->can('setpos')) {
> >   520         # Seek to the start position.
> >   521         $fh->setpos($start_pos);
> >   522     }
> >
> > _______________________________________________
>
> You are always welcome to reopen and update the bug, or file a new one.
> > chris > > From cjfields at illinois.edu Thu Aug 25 21:04:15 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Aug 2011 16:04:15 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> Message-ID: On Aug 25, 2011, at 1:52 PM, J.J. Emerson wrote: > Hi Chris, > > You asked: > > My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway? If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? > > In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. > > My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. > > I see. I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? That's fine. I don't want to dissuade you from taking this on, either. > Cheers, > > J.J. 
> > PS > > * The way I plan on using my script is roughly as follows: > > prog1 [some arguments] \ > | myscript.pl --informat fasta \ > | prog2 \ > | prog3 > pipeline.output > > However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. In principle, being able to do the following would be very useful: > > prog1 [some arguments] \ > | myscript.pl \ > | prog2 > pipeline.output > > The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. Not disagreeing with you at all, flexible code is best. chris From hlapp at drycafe.net Fri Aug 26 02:29:44 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Fri, 26 Aug 2011 11:29:44 +0900 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: References: Message-ID: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> Could this behavior perhaps be made optional, with the default being off? -hilmar On Aug 25, 2011, at 11:35 PM, Brian Osborne wrote: > bioperl-l, > > I need to run something by you before I commit code and tests. I > have code that takes a Genbank file as input and creates another > Genbank file as output. I noticed that SeqIO - specifically > FTHelper.pm - was taking a tag like this in the input file: > > /score=100.1 > > And adding a "note" tag, so the output file contains this: > > /score=100.1 > /note="score=100.1" > > I'm assuming that the code does this because NCBI will not accept > score tags and values even though Bioperl, generally speaking, does > not say that NCBI defines the fine details of Genbank format. > > On the other hand I don't like the idea that SeqIO is altering the > content. 
It also turns out that if you have code that does multiple > round-trips you end up with text like this: > > /score=100.1 > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > /note="score=100.1" > > Should I comment out the code that's doing these edits or not? > > Thanks again, > > Brian O. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From carandraug+dev at gmail.com Fri Aug 26 14:20:39 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Fri, 26 Aug 2011 15:20:39 +0100 Subject: [Bioperl-l] Problem using Bio::Tools::Run::RemoteBlast In-Reply-To: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> References: <37711.192.168.1.254.1313992876.squirrel@webmail.ibab.ac.in> Message-ID: On 22 August 2011 07:01, Lucky Singh wrote: > Now I > wanted to host it from web server, but This program is not working from it > may be it is not able to create or write on file from web server but in > command line it is working fine. I don't know the possible reason, please > help me to figure it out. Have you looked in the apache logs (look in /var/log/apache2/error.log) ? Can you pastebin your whole code and the content of the error log after trying to run the script? From bosborne11 at verizon.net Fri Aug 26 14:39:44 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 26 Aug 2011 10:39:44 -0400 Subject: [Bioperl-l] SeqIO alters Genbank files In-Reply-To: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> References: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> Message-ID: <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net> Hilmar, Yes, of course. 
Are you thinking that this code is designed, in part, to help people submit to NCBI? BIO On Aug 25, 2011, at 10:29 PM, Hilmar Lapp wrote: > Could this behavior perhaps be made optional, with the default being off? > > -hilmar > > On Aug 25, 2011, at 11:35 PM, Brian Osborne wrote: > >> bioperl-l, >> >> I need to run something by you before I commit code and tests. I have code that takes a Genbank file as input and creates another Genbank file as output. I noticed that SeqIO - specifically FTHelper.pm - was taking a tag like this in the input file: >> >> /score=100.1 >> >> And adding a "note" tag, so the output file contains this: >> >> /score=100.1 >> /note="score=100.1" >> >> I'm assuming that the code does this because NCBI will not accept score tags and values even though Bioperl, generally speaking, does not say that NCBI defines the fine details of Genbank format. >> >> On the other hand I don't like the idea that SeqIO is altering the content. It also turns out that if you have code that does multiple round-trips you end up with text like this: >> >> /score=100.1 >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> /note="score=100.1" >> >> Should I comment out the code that's doing these edits or not? >> >> Thanks again, >> >> Brian O. 
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
> ===========================================================

From hlapp at drycafe.net Fri Aug 26 14:50:26 2011
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Fri, 26 Aug 2011 23:50:26 +0900
Subject: [Bioperl-l] SeqIO alters Genbank files
In-Reply-To: <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net>
References: <00375B6C-64AE-4D43-9D98-6CD90C31A76A@drycafe.net> <9EB8EA4F-0E22-4446-A57E-F726E001B068@verizon.net>
Message-ID:

On Aug 26, 2011, at 11:39 PM, Brian Osborne wrote:

> Are you thinking that this code is designed, in part, to help people
> submit to NCBI?

I don't know, but perhaps. My thinking was, if the code is doing something that's useful in some, but bad in many or most other situations, it'd be nice if the useful behavior could be retained as an option for those who expressly want (or need) it.

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================

From florent.angly at gmail.com Sat Aug 27 11:12:05 2011
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 27 Aug 2011 21:12:05 +1000
Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe
In-Reply-To:
References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu>
Message-ID: <4E58D105.7050805@gmail.com>

On the topic of guessing file formats, last I checked, it was difficult to reuse the format guessed by Bio::SeqIO.

For example, if I want to take sequences in any format (FASTA, FASTQ, ...)
and filter some of them out and put them in a new file in the same format, I need to do something along these lines:

    # Open the file and let BioPerl guess its format
    my $in = Bio::SeqIO->new( -file => $input_seqfile );

    # Have BioPerl guess the format (again) so we can use the same format for the output file
    my $format = $in->_guess_format( $input_seqfile );

    # Open the output file (same format as the input file)
    my $out = Bio::SeqIO->new( -file => ">".$output_seqfile, -format => $format );

    # Now do the work...

The limitation of the code above is that it is more complex than it should be and forces BioPerl to check the file format twice. My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store or access its value. The idea would be that the example code above could be rewritten as:

    # Open the file and let BioPerl guess its format
    my $in = Bio::SeqIO->new( -file => $input_seqfile );

    # Retrieve the format guessed by BioPerl
    my $format = $in->format( );

    # Open the output file using the same format as the input file
    my $out = Bio::SeqIO->new( -file => ">".$output_seqfile, -format => $format );

    # Now do the work...

I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, which guesses the alphabet but has a get/set method to access it.

Florent

On 26/08/11 07:04, Chris Fields wrote:
> On Aug 25, 2011, at 1:52 PM, J.J. Emerson wrote:
>
>> Hi Chris,
>>
>> You asked:
>>
>> My question (not a criticism, just trying to understand the problem): why are you going through all the trouble of using GuessSeqFormat as a permanent solution anyway?
If you have a stream returning a possibly unknown data type, I would argue that the fundamental bug is not GuessSeqFormat but something else, more specifically not knowing the behavior of the data source and the returned format to begin with. Is something preventing that? >> >> In my particular case, I'm trying not to impose a particular usage scenario onto the script I'm writing in the hopes it will be useful (and general) to others in my lab in the future*. In my proximate case, I will certainly be able to provide SeqIO with a format argument. But insofar as GuessSeqFormat is considered desirable (and reasonable people could indeed disagree whether it is desirable) I think its applicability shouldn't hinge on whether it is guessing on a pipe or a file. >> >> My point is, GuessSeqFormat is fine as a temporary stop-gap, but it is not a permanent solution to your problems (it is guessing, after all). Note the code has had very little development over the years, and the related SeqIO code hasn't aged particularly well. >> >> I see. I wasn't aware that GuessSeqFormat was so relatively neglected. Given the rather challenging nature of the more elegant fix you suggested (using the buffering of Root:IO), perhaps I should consider dropping my issue or filing it as a feature request rather than a bug? > That's fine. I don't want to dissuade you from taking this on, either. > >> Cheers, >> >> J.J. >> >> PS >> >> * The way I plan on using my script is roughly as follows: >> >> prog1 [some arguments] \ >> | myscript.pl --informat fasta \ >> | prog2 \ >> | prog3> pipeline.output >> >> However, I'd like for the "--informat" switch to be optional, mainly to increase usability for other users. For any well considered format, the information is right there in the data to know what the format is, and as such, providing the format a second time is somewhat redundant. 
In principle, being able to do the following would be very useful: >> >> prog1 [some arguments] \ >> | myscript.pl \ >> | prog2> pipeline.output >> >> The modularity of pipelining is very valuable and this is what caused me to anticipate a usage scenario that involved both GuessSeqFormat and reading from a pipe. > Not disagreeing with you at all, flexible code is best. > > chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Sat Aug 27 03:54:05 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 26 Aug 2011 22:54:05 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E58D105.7050805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: On Aug 27, 2011, at 6:12 AM, Florent Angly wrote: > On the topic of guessing file formats, last I checked, it was difficult to reuse the format guessed by Bio::SeqIO > > For example, if I want to takes sequences in any format (FASTA, FASTQ, ...) and filter some of them out and put them in a new file in the same format, I need to do something along these lines: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Have Bioperl guess the format (again) so we can use the same format for the output file > my $format = $in->_guess_format( $input_seqfile ); > > # Open the output file (same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > The limitations of the code above is that in is more complex than it should be and forces Bioperl do check the file format twice. 
My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store or access its value.

The name of the class is the format (that's how they are loaded). We could add this as a convenience level for Bio::SeqIO (fairly easy to do, actually), but it would only make sense as a getter. Bio::SeqIO dynamically loads the proper Bio::SeqIO:: module in the constructor (Bio::SeqIO::genbank, for example). Even if you could set the format to 'fasta' on a loaded Bio::SeqIO::genbank, you would still get GenBank format.

> The idea would be that the example code above could be rewritten as:
>
>     # Open the file and let BioPerl guess its format
>     my $in = Bio::SeqIO->new( -file => $input_seqfile );
>
>     # Retrieve the format guessed by BioPerl
>     my $format = $in->format( );
>
>     # Open the output file using the same format as the input file
>     my $out = Bio::SeqIO->new( -file => ">".$output_seqfile, -format => $format );
>
>     # Now do the work...
>
> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it.
>
> Florent

Guessing the alphabet for the vast majority of sequence data isn't quite as complex and quixotic as guessing a sequence format. The set of formats is far more variable and keeps growing, much like standards (ex: http://xkcd.com/927/). Not that sequences aren't capable of change...
chris From hlapp at drycafe.net Sat Aug 27 03:43:57 2011 From: hlapp at drycafe.net (Hilmar Lapp) Date: Sat, 27 Aug 2011 12:43:57 +0900 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E58D105.7050805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: The format is already available - it is in essence the class of the SeqIO instance: my $format = ref($in); Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: my $out = ref($in)->new(-file => ...); Would that address what you are trying to accomplish? -hilmar Sent with a tap. On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: > My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The idea would be that the example code above could be rewritten as: > > # Open the file and let BioPerl guess its format > my $in = Bio::SeqIO->new( -file => $input_seqfile ); > > # Retrieve the format guessed by BioPerl > my $format = $in->format( ); > > # Open the output file using the same format as the input file > my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); > > # Now do the work... > > I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. 
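A minimal sketch of the idea in the message above, for illustration only: if the parser's class name carries the format, the last component of ref($in) can serve as the format string, and the same class can be reused to build the writer. The Demo::SeqIO::fasta package and the format_of() helper are hypothetical stand-ins so that the snippet runs without BioPerl installed; with a real Bio::SeqIO object the same split should apply, assuming the class follows the Bio::SeqIO::<format> naming described in this thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a format-specific parser class such as
# Bio::SeqIO::fasta; it exists only so the example runs without BioPerl.
package Demo::SeqIO::fasta;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

package main;

# Hypothetical helper: take the last '::'-separated component of the
# object's class name and treat it as the format name.
sub format_of {
    my ($io) = @_;
    return ( split /::/, ref $io )[-1];
}

my $in     = Demo::SeqIO::fasta->new( -file => 'in.fa' );
my $format = format_of($in);

# Reuse the input object's class to build the writer.
my $out = ref($in)->new( -file => '>out.fa' );

print "format: $format\n";
print "writer class: ", ref($out), "\n";
```

Instantiating through ref($in) avoids a second round of format guessing, at the cost of tying the writer to exactly the class that was loaded for the reader.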
From florent.angly at gmail.com Sun Aug 28 09:08:32 2011 From: florent.angly at gmail.com (Florent Angly) Date: Sun, 28 Aug 2011 19:08:32 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> Message-ID: <4E5A0590.2010805@gmail.com> Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. Florent On 27/08/11 13:43, Hilmar Lapp wrote: > The format is already available - it is in essence the class of the SeqIO instance: > > my $format = ref($in); > > Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: > > my $out = ref($in)->new(-file => ...); > > Would that address what you are trying to accomplish? > > -hilmar > > Sent with a tap. > > On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: > >> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. The idea would be that the example code above could be rewritten as: >> >> # Open the file and let BioPerl guess its format >> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >> >> # Retrieve the format guessed by BioPerl >> my $format = $in->format( ); >> >> # Open the output file using the same format as the input file >> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >> >> # Now do the work... >> >> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. 
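Returning to the problem that opened this thread: seek() fails on a pipe, so a format guesser cannot rewind after peeking at the data. One workaround mentioned earlier is to copy the stream somewhere seekable first. The sketch below uses core Perl only; the helper name seekable_copy() is made up for this example and is not part of BioPerl. It buffers the whole stream in memory, so it shares the memory cost noted earlier in the thread, and a tempfile would be the alternative for large inputs.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): return a seekable handle
# with the same content as $fh. Regular files are returned untouched;
# pipes and other non-seekable streams are slurped into an in-memory file.
sub seekable_copy {
    my ($fh) = @_;
    return $fh if -f $fh;                 # plain file: seek() already works

    my $data = do { local $/; <$fh> };    # slurp the whole stream
    $data = '' unless defined $data;
    open my $mem, '<', \$data
        or die "Cannot open in-memory handle: $!";
    return $mem;
}

# Demo: read from a pipe, peek at the first line, then rewind.
open my $pipe, '-|', $^X, '-e', 'print ">seq1\nACGT\n"'
    or die "Cannot start child process: $!";

my $fh    = seekable_copy($pipe);
my $first = <$fh>;                        # peek, as a format guesser would
seek $fh, 0, 0 or die "seek failed: $!";
my $again = <$fh>;                        # the same line again after rewinding
close $pipe;

print "first line: $first";
print "after seek: $again";
```

In a script like the one discussed here, the returned handle could presumably be passed on with the -fh argument instead of \*STDIN, so that guessing operates on a seekable handle.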
From cjfields at illinois.edu Sun Aug 28 03:27:34 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sat, 27 Aug 2011 22:27:34 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E5A0590.2010805@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> Message-ID: <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> There is no reason the variant couldn't also be a method; it's fairly generic to Bio::SeqIO. FASTQ just happens to be the only parser that takes advantage of it (probably b/c I added it when I refactored FASTQ :) See the code for Bio::SeqIO::new to see what is done. Again, like the format it only makes sense as a getter method. chris On Aug 28, 2011, at 4:08 AM, Florent Angly wrote: > > Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. > Florent > > > On 27/08/11 13:43, Hilmar Lapp wrote: >> The format is already available - it is in essence the class of the SeqIO instance: >> >> my $format = ref($in); >> >> Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: >> >> my $out = ref($in)->new(-file => ...); >> >> Would that address what you are trying to accomplish? >> >> -hilmar >> >> Sent with a tap. >> >> On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: >> >>> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. 
The idea would be that the example code above could be rewritten as: >>> >>> # Open the file and let BioPerl guess its format >>> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >>> >>> # Retrieve the format guessed by BioPerl >>> my $format = $in->format( ); >>> >>> # Open the output file using the same format as the input file >>> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >>> >>> # Now do the work... >>> >>> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florent.angly at gmail.com Sun Aug 28 22:35:36 2011 From: florent.angly at gmail.com (Florent Angly) Date: Mon, 29 Aug 2011 08:35:36 +1000 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> Message-ID: <4E5AC2B8.9060808@gmail.com> Hi, I implemented the format() getter method in Bio::SeqIO as discussed, essentially following the way proposed by Hilmar. The variant() method is not needed since Bio::SeqIO::fastq already has a get/set method for that. I noticed that there are plenty more Bio*IO modules that could benefit from having a format() method, e.g.: Bio::AlignIO Bio::ClusterIO Bio::FeatureIO Bio::MapIO Bio::OntologyIO Bio::SearchIO Bio::TreeIO Bio::Assembly::IO * The code could be copy-pasted for each of them but it is not very graceful. Is there a way we could have all these IO modules share the same format() method? 
* Note how the IO class for Bio::Assembly is called Bio::Assembly::IO, and not Bio::AssemblyIO like for other classes. This may be something to change in the future for consistency. Florent On 28/08/11 13:27, Chris Fields wrote: > There is no reason the variant couldn't also be a method; it's fairly generic to Bio::SeqIO. FASTQ just happens to be the only parser that takes advantage of it (probably b/c I added it when I refactored FASTQ :) > > See the code for Bio::SeqIO::new to see what is done. Again, like the format it only makes sense as a getter method. > > chris > > On Aug 28, 2011, at 4:08 AM, Florent Angly wrote: > >> Yes indeed, that's a very convenient way to implement a format() methods that gets the format of the file. I'll try to implement it today. More logic may be involved because of the formats that take variants, e.g. the FASTQ format (Bio::SeqIO::fastq module) has a 'sanger', 'illumina' and 'solexa' variants. >> Florent >> >> >> On 27/08/11 13:43, Hilmar Lapp wrote: >>> The format is already available - it is in essence the class of the SeqIO instance: >>> >>> my $format = ref($in); >>> >>> Rather than passing that into SeqIO->new(), you can directly instantiate a new object from it: >>> >>> my $out = ref($in)->new(-file => ...); >>> >>> Would that address what you are trying to accomplish? >>> >>> -hilmar >>> >>> Sent with a tap. >>> >>> On Aug 27, 2011, at 8:12 PM, Florent Angly wrote: >>> >>>> My proposal would be to store the format of a file somewhere in the Bio::SeqIO object and create a new get/set method in Bio::SeqIO called format() to store of access its value. 
The idea would be that the example code above could be rewritten as: >>>> >>>> # Open the file and let BioPerl guess its format >>>> my $in = Bio::SeqIO->new( -file => $input_seqfile ); >>>> >>>> # Retrieve the format guessed by BioPerl >>>> my $format = $in->format( ); >>>> >>>> # Open the output file using the same format as the input file >>>> my $out = Bio::SeqIO->new( -file => ">".$output_seqfile , format => $format ); >>>> >>>> # Now do the work... >>>> >>>> I think this is more elegant since it is more readable, requires less computation (the file format is guessed once), and is more consistent with other Bio::SeqIO methods like alphabet, that guesses the alphabet but has a get/set method to access it. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Mon Aug 29 01:10:27 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 28 Aug 2011 20:10:27 -0500 Subject: [Bioperl-l] Bio::SeqIO can't guess the format of data from a pipe In-Reply-To: <4E5AC2B8.9060808@gmail.com> References: <2E070303-ADDF-472B-835B-6D3959B83E9A@illinois.edu> <4E58D105.7050805@gmail.com> <4E5A0590.2010805@gmail.com> <8D639B95-0666-4F09-8E9E-88C8CDF76ABC@illinois.edu> <4E5AC2B8.9060808@gmail.com> Message-ID: On Aug 28, 2011, at 5:35 PM, Florent Angly wrote: > Hi, > > I implemented the format() getter method in Bio::SeqIO as discussed, essentially following the way proposed by Hilmar. The variant() method is not needed since Bio::SeqIO::fastq already has a get/set method for that. Right, but the method could be used by other modules if it were moved to Bio::SeqIO. for instance. 
> I noticed that there are plenty more Bio*IO modules that could benefit from having a format() method, e.g.:
>     Bio::AlignIO
>     Bio::ClusterIO
>     Bio::FeatureIO
>     Bio::MapIO
>     Bio::OntologyIO
>     Bio::SearchIO
>     Bio::TreeIO
>     Bio::Assembly::IO *
> The code could be copy-pasted for each of them but it is not very graceful. Is there a way we could have all these IO modules share the same format() method?

Move the method to Bio::Root::IO, the common base class for all of the above.

> * Note how the IO class for Bio::Assembly is called Bio::Assembly::IO, and not Bio::AssemblyIO like for other classes. This may be something to change in the future for consistency.
>
> Florent

That's possible; one could take advantage of that for redesign/API issues if it were needed.

chris

From noncoding at gmail.com Mon Aug 29 10:31:10 2011
From: noncoding at gmail.com (Remo Sanges)
Date: Mon, 29 Aug 2011 12:31:10 +0200
Subject: [Bioperl-l] Opportunity: PhD in BIOINFORMATICS at SZN, Naples, Italy
In-Reply-To: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net>
References: <7F0AE58E-6052-469B-ACD0-207FAD060472@drycafe.net>
Message-ID: <4E5B6A6E.2020508@gmail.com>

(Apologies if you have received this already or if this is considered spam. Please feel free to pass on to anyone who might be interested.)

The Stazione Zoologica Anton Dohrn in Naples is among the top research institutions in the world in the fields of marine biology and ecology. The newly established bioinformatics laboratory is seeking a candidate interested in the evolution of genome architecture: http://bit.ly/okEGvL

We are looking for someone who understands basic biological and evolutionary problems and is able to independently accomplish bioinformatics tasks. Candidates will be expected to have knowledge of biology, genetics and functional genomics, to demonstrate the ability to work in a UNIX/Linux environment and to be familiar with a scripting language (e.g. Perl), a database system (e.g.
MySQL) and a statistical programming environment (e.g. R). Previous experience with comparative genomics and genomics databases, as well as an understanding of statistical methods used in the interpretation of biological data, is a desirable asset. Wet lab work might be required during the PhD.

All the information about the PhD and the guidelines on how to apply are listed on the webpage http://bit.ly/d2WuXk

The closing date for applications is 20 September 2011.

Kind Regards
Remo

--
Remo Sanges
Bioinformatics - Animal Physiology and Evolution
Stazione Zoologica Anton Dohrn
Villa Comunale, 80121 Napoli - Italy
+39 081 5833428

From locarpau at upvnet.upv.es Mon Aug 29 16:47:13 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 18:47:13 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu>
Message-ID: <1314636433.4e5bc291a40c6@webmail.upv.es>

Hi all,

I'm running codeml from the PAML package using the corresponding Bioperl wrapper.
I'd like to save the output file as -outfile => 'mlc', as in:

my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml (
    -outfile        => 'mlc',
    -save_tempfiles => 1,
    -alignment      => $codon_MSA,
    -tree           => $biotree,
    -params         => {
        #'outfile'      => 'mlc',
        'verbose'      => 1,
        'noisy'        => 9,
        'runmode'      => 0,   # user tree
        'seqtype'      => 1,
        'model'        => $model,
        'NSsites'      => $NSsites,
        'fix_omega'    => $fix_omega,
        'omega'        => $omega,
        'ncatG'        => $ncatG,
        'icode'        => 0,   # 0: universal code; 1: mammalian mt; 2-10: see below (5: ciliate nuclear)
        #'fix_alpha'    => 0,
        #'fix_kappa'    => 0,
        #'RateAncestor' => 0,
        'CodonFreq'    => 2,
        'cleandata'    => 1,   # remove sites with ambiguity data (1 yes, 0 no)
        'ndata'        => 1
    },
);

and subsequently parsing it using

my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./");

However, I get the following message.

------------- EXCEPTION -------------
MSG: Could not open mlc: No such file or directory
STACK Bio::Root::IO::_initialize_io /Library/Perl//5.10.0/Bio/Root/IO.pm:351
STACK Bio::Tools::Phylo::PAML::new /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239
STACK main::BranchSiteEvolAnalysis /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421
STACK toplevel /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939
-------------------------------------

which I guess means the output file is not being saved in the previous step. Does anyone know what's wrong? Thank you very much in advance for your help.
Cheers,
Lorenzo

From David.Messina at sbc.su.se Mon Aug 29 17:43:33 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 29 Aug 2011 19:43:33 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1314636433.4e5bc291a40c6@webmail.upv.es>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es>
Message-ID:

Hi Lorenzo,

> and subsequently parsing it using
> my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir => "./");
>
> However, I get the following message.
>
> ------------- EXCEPTION -------------
> MSG: Could not open mlc: No such file or directory
>
> what I guess means the output file is not being saved in the previous step.

Your interpretation could be correct. I think, though, that it might be that the -dir parameter you specify, "./", is not correct. Are you seeing the mlc file in the '.' (current working) dir?

If I remember correctly, by default the mlc file is created in a temporary directory in /scratch or /tmp, and the save_tempfiles flag simply keeps that temporary directory from being deleted.

I don't have the docs in front of me, but I believe there's a way to get the path of the temp directory that B::T::P::PAML is using. If so, you can use that path as the value for the -dir parameter.

Let me know if not, though, and we can follow up on this.

Dave

PS - also, could you verify that you're using the latest versions of bioperl-live and bioperl-run from Github?
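
A minimal sketch of the check Dave describes, assuming the factory was created with -save_tempfiles => 1 and that its tempdir() method reports the directory codeml ran in. The helper find_output() is hypothetical (it is not part of BioPerl); it only confirms that a non-empty mlc file exists before a parser is pointed at it, which helps tell "codeml never wrote the file" apart from "the parser is looking in the wrong place". The BioPerl calls are left in comments so the snippet runs without BioPerl installed.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper, not part of BioPerl: return the full path to $name
# inside $dir if that file exists and is non-empty, otherwise undef.
sub find_output {
    my ($dir, $name) = @_;
    return undef unless defined $dir && length $dir;
    my $path = File::Spec->catfile($dir, $name);
    return (-s $path) ? $path : undef;
}

# Assumed usage with the factory from the thread (commented out so that the
# snippet does not depend on BioPerl):
#
#   my $tmpdir = $codeml_factory->tempdir();
#   if ( my $mlc = find_output($tmpdir, 'mlc') ) {
#       my $parser = Bio::Tools::Phylo::PAML->new(-file => $mlc, -dir => $tmpdir);
#   }
#   else {
#       warn "No mlc file in $tmpdir; codeml may not have run at all\n";
#   }

# Standalone check against the current directory.
my $found = find_output(File::Spec->curdir, 'mlc');
print defined $found ? "found $found\n" : "no mlc file here\n";
```

If the helper returns undef for the temporary directory, the problem is probably upstream of the parser (for example, codeml was never actually run), and changing -dir alone is unlikely to help.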
From Kevin.M.Brown at asu.edu Mon Aug 29 18:09:29 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 29 Aug 2011 11:09:29 -0700
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To:
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu><1314636433.4e5bc291a40c6@webmail.upv.es>
Message-ID: <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu>

Opening a file for output that does not exist requires the > or >> redirector (depending on whether you want to overwrite or append output).

my $parserF= Bio::Tools::Phylo::PAML->new (-file => ">mlc", -dir => "./");

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University

From scott at scottcain.net Mon Aug 29 18:34:41 2011
From: scott at scottcain.net (Scott Cain)
Date: Mon, 29 Aug 2011 14:34:41 -0400
Subject: [Bioperl-l] pls help..
In-Reply-To:
References: <92CA808D-16F0-4F08-BC44-8A0C06292EA8@scottcain.net> <1D308407-17A9-4203-9D6C-D71FA0FD74D0@illinois.edu>
Message-ID:

Hi Ravi,

Sorry I took a while to get back to you; I was on vacation last week. Also, please keep correspondence on the bioperl mailing list. If you had, perhaps somebody else would have provided another answer by now.

I found the bug in the genbank2gff3 script that causes this problem. You have a few options for how to proceed:

1. Split the multi-genbank file into individual files, put them in a directory, and point the script at that directory (with the --dir flag). If you do this, you won't have to do anything with your BioPerl installation.

2. Get a fresh checkout of bioperl-live from git and install BioPerl from it, as I just committed the fix to the master branch.

3. Manually apply the fix that I just put into master. The diff is here:

https://github.com/bioperl/bioperl-live/commit/1cff7d541e704a1f35d85bb27a0ab5911d89f8df

Scott

On Tue, Aug 23, 2011 at 12:55 AM, Ravi Devani wrote:
> Yes the script works but have you seen the gff file generated by it. It has
> multiple entries for the same features. And the file keeps on growing in
> size with the same features repeated many times. That's the problem..
>
> Thanking you,
> Ravi

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)         216-392-3087
Ontario Institute for Cancer Research

From locarpau at upvnet.upv.es Mon Aug 29 18:56:50 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 20:56:50 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To:
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es>
Message-ID: <1314644210.4e5be0f277c05@webmail.upv.es>

Thanks Dave,

Yes. I did not find the output file in the current directory or in the temp directory. Using

my $tmpdir = $codeml_factory->tempdir();
my $parserF= Bio::Tools::Phylo::PAML->new (
    -file => "mlc",
    -dir  => "$tmpdir"
);

I still get the same error message. I'm using Bioperl version 1.006901.

Cheers,
Lorenzo

From locarpau at upvnet.upv.es Mon Aug 29 19:05:49 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Mon, 29 Aug 2011 21:05:49 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu><1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB29D@EX02.asurite.ad.asu.edu>
Message-ID: <1314644749.4e5be30d78cb7@webmail.upv.es>

Kevin,

Still the same. The previous message is now preceded by:

Filehandle GEN11 opened only for output at /Library/Perl//5.10.0/Bio/Root/IO.pm line 571

which points to

    # if the buffer been filled by _pushback then return the buffer
    # contents, rather than read from the filehandle
    if( @{$self->{'_readbuffer'} || [] } ) {
        $line = shift @{$self->{'_readbuffer'}};
    } else {
        $line = <$fh>;
    }

from the inner subroutine _readline of /Bio/Root/IO.pm

Best,
L

From Kevin.M.Brown at asu.edu Mon Aug 29 19:19:53 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 29 Aug 2011 12:19:53 -0700
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1314636433.4e5bc291a40c6@webmail.upv.es>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es>
Message-ID: <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>

OK, went back to the original message.

And here's where the problem actually originates...

my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml (
    # this should cause it to create a file called mlc
    -outfile        => '>mlc',
    -save_tempfiles => 1,
    -alignment      => $codon_MSA,
    -tree           => $biotree,
    -params         => {
        'verbose'      => 1,
        'noisy'        => 9,
        'runmode'      => 0,   # user tree
        'seqtype'      => 1,
        'model'        => $model,
        'NSsites'      => $NSsites,
        'fix_omega'    => $fix_omega,
        'omega'        => $omega,
        'ncatG'        => $ncatG,
        'icode'        => 0,   # 0: universal code; 1: mammalian mt; 2-10: see below (5: ciliate nuclear)
        #'fix_alpha'    => 0,
        #'fix_kappa'    => 0,
        #'RateAncestor' => 0,
        'CodonFreq'    => 2,
        'cleandata'    => 1,   # remove sites with ambiguity data (1 yes, 0 no)
        'ndata'        => 1
    },
);

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University

From locarpau at upvnet.upv.es Mon Aug 29 23:19:46 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Tue, 30 Aug 2011 01:19:46 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>
Message-ID: <1314659986.4e5c1e9268078@webmail.upv.es>

Kevin,

That's pretty reasonable, but unfortunately it still doesn't run, even if I create the file as $outfile and give it as the value to the wrapper as -outfile => $outfile. It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at creating the outfile. Did anyone manage to generate the outfile from Bio::Tools::Run::Phylo::PAML::Codeml?

Cheers,
Lorenzo

From jason.stajich at gmail.com Tue Aug 30 00:05:57 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Mon, 29 Aug 2011 17:05:57 -0700
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To: <1314659986.4e5c1e9268078@webmail.upv.es>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> <1314636433.4e5bc291a40c6@webmail.upv.es> <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu> <1314659986.4e5c1e9268078@webmail.upv.es>
Message-ID:

I think you are mistaken on how to use the factory running objects and the associated parser. You don't have to instantiate a parser, as this is what is returned by the run command.

The whole point is that you don't need to get to the tempdir or specify opening of the mlc file or all the other output files from the program. You get to use the parser to get the data out, and then it cleans up afterwards, so you can run many iterations of runs in separate folders without having to clean up afterwards.

http://www.bioperl.org/wiki/HOWTO:PAML

my $factory = Bio::Tools::Run::Phylo::PAML::Codeml->new( ... );
my ($rc,$parser) = $factory->run( );
if( my $result = $parser->next_result ) {
    # $result is a Bio::Tools::Phylo::PAML::Result object
}

From fs5 at sanger.ac.uk Tue Aug 30 09:45:46 2011
From: fs5 at sanger.ac.uk (Frank Schwach)
Date: Tue, 30 Aug 2011 10:45:46 +0100
Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there
In-Reply-To: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu>
References: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu>
Message-ID: <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk>

Yes, I still have the primer3redux doc on my TODO list. Sorry, I haven't had the time to do this lately but will look into this as soon as I can.

Frank

On Mon, 2011-08-22 at 15:10 -0500, Chris Fields wrote:
> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote:
>
> > my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile =>
> > "temp.out", -path => "/usr/bin/primer3_core");
> >
> > If I use this:
> > $primer3->add_targets(
> > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
> > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
> > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
> > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
> > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
> > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
> > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
> > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
> > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE'
> > =>$PRIMER_PRODUCT_SIZE_RANGE);
> >
> > I get:
> > Can't locate object method "add_targets" via package
> > "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1.
> >
> > On the other hand, if I change that line to:
> > $primer3->set_parameters(
> > 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
> > 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
> > 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
> > 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
> > 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
> > 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
> > 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
> > 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
> > 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE'
> > =>$PRIMER_PRODUCT_SIZE_RANGE);
> >
> > It works. When I looked at the source code for Primer3Redux, I
> > couldn't find add_targets, but set_parameters looked like it might
> > work, so I used that instead, and it worked.
> >
> > But I see over in the github that there are other issues with the
> > documentation (how primer3redux's result object is now 3 deep rather
> > than 2 deep). Not sure if this is in that category or not.
>
> That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github; if you can file this as well it would help.
>
> I have to go back and look at the history of add_targets() relative to primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience...
>
> chris
>
> > Thanks,
> > Anand
> ...

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

From manju.rawat2 at gmail.com Tue Aug 30 11:22:33 2011
From: manju.rawat2 at gmail.com (Manju Rawat)
Date: Tue, 30 Aug 2011 07:22:33 -0400
Subject: [Bioperl-l] Bioperl query....
Message-ID:

Hi, please help me.
I am very new to BioPerl, and I want to use a BLAST report in my program, but I don't know how to use it. Please tell me how to use the HSP, gaps, etc. methods, and how to use them to extract values from a BLAST file.

Thanks
Manju Rawat

From roy.chaudhuri at gmail.com Tue Aug 30 11:25:32 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 30 Aug 2011 12:25:32 +0100
Subject: [Bioperl-l] Bioperl query....
In-Reply-To:
References:
Message-ID: <4E5CC8AC.8050800@gmail.com>

Hi Manju,

See:
http://www.bioperl.org/wiki/HOWTO:SearchIO

Cheers,
Roy.
> > Thanks > Manju Rawat > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Aug 30 13:54:19 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 30 Aug 2011 08:54:19 -0500 Subject: [Bioperl-l] primer3redux 0.09 add_targets is not there In-Reply-To: <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk> References: <3BE41688-C163-4EA1-AF6A-34A6052FCFEA@illinois.edu> <1314697546.3797.8.camel@deskpro15336.internal.sanger.ac.uk> Message-ID: <8063FB1D-4557-4D1B-B9EF-9833ECD440E9@illinois.edu> S'okay, we're all a bit busy :P chris On Aug 30, 2011, at 4:45 AM, Frank Schwach wrote: > Yes, I still have the primer3redux doc on my TODO list. Sorry, haven't > had the time to do this lately but will loook into this as soon as I > can. > Frank > > > On Mon, 2011-08-22 at 15:10 -0500, Chris Fields wrote: >> On Aug 22, 2011, at 2:52 PM, Anand Patel wrote: >> >>> my $primer3 = Bio::Tools::Run::Primer3Redux->new(-outfile => >>> "temp.out", -path => "/usr/bin/primer3_core"); >>> >>> If I use this: >>> $primer3->add_targets( >>> 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM, >>> 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM, >>> 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM, >>> 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE, >>> 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE, >>> 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC, >>> 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT, >>> 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC, >>> 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE' >>> =>$PRIMER_PRODUCT_SIZE_RANGE); >>> >>> I get: >>> Can't locate object method "add_targets" via package >>> "Bio::Tools::Run::Primer3Redux" at p3ra.pl line 31, line 1. 
>>>
>>> On the other hand, if I change that line to:
>>> $primer3->set_parameters(
>>> 'PRIMER_OPT_TM'=>$PRIMER_OPT_TM,'PRIMER_MIN_TM'=>$PRIMER_MIN_TM,
>>> 'PRIMER_MAX_TM'=>$PRIMER_MAX_TM,
>>> 'PRIMER_PAIR_MAX_DIFF_TM'=>$PRIMER_MAX_DIFF_TM,
>>> 'PRIMER_MAX_SIZE'=>$PRIMER_MAX_SIZE,'PRIMER_OPT_SIZE'=>$PRIMER_OPT_SIZE,
>>> 'PRIMER_MIN_SIZE'=>$PRIMER_MIN_SIZE,
>>> 'PRIMER_MAX_GC'=>$PRIMER_MAX_GC,
>>> 'PRIMER_OPT_GC_PERCENT'=>$PRIMER_OPT_GC_PERCENT,
>>> 'PRIMER_MIN_GC'=>$PRIMER_MIN_GC,
>>> 'SEQUENCE_TARGET'=>$TARGET, 'PRIMER_PRODUCT_SIZE_RANGE'
>>> =>$PRIMER_PRODUCT_SIZE_RANGE);
>>>
>>> It works. When I looked at the source code for Primer3Redux, I
>>> couldn't find add_targets, but set_parameters looked like it might
>>> work, so I used that instead, and it worked.
>>>
>>> But I see over in the github that there are other issues with the
>>> documentation (how primer3redux's result object is now 3 deep rather
>>> than 2 deep). Not sure if this is in that category or not.
>>
>> That is true; documentation was to be updated but that hasn't happened yet (haven't had the free time to work specifically on this, and I think fschwach was to work on some HOWTO documentation). I do plan on an update in the next few weeks to address the various Issues on github; if you can file this as well it would help.
>>
>> I have to go back and look at the history of add_targets() relative to primer3 bioperl code, but I don't think this was part of the commit history of Bio::Tools::Run::Primer3Redux (maybe for the old version, Bio::Tools::Run::Primer3), so that is probably cruft left over from the update. Would be easy enough to alias it for convenience...
>>
>> chris
>>
>>> Thanks,
>>> Anand
>> ...
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research
> Limited, a charity registered in England with number 1021457 and a
> company registered in England with number 2742969, whose registered
> office is 215 Euston Road, London, NW1 2BE.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From locarpau at upvnet.upv.es Tue Aug 30 14:58:51 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Tue, 30 Aug 2011 16:58:51 +0200
Subject: [Bioperl-l] Saving Codeml Output file
In-Reply-To:
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu><4DF56976.8080704@upvnet.upv.es><9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu>
 <1314636433.4e5bc291a40c6@webmail.upv.es>
 <1A4207F8295607498283FE9E93B775B407CCB2D3@EX02.asurite.ad.asu.edu>
 <1314659986.4e5c1e9268078@webmail.upv.es>
Message-ID: <1314716331.4e5cfaab4958e@webmail.upv.es>

Thanks Jason,
OK, I see. That's what I was trying at the beginning. This runs OK in my scripts for branch-specific models. However, when I try branch-site models (NSsites > 0) and try to parse the results using

my $model_result= $paml_result->get_NSSite_results

I start to have problems. According to Dumper, I'm able to generate a Bio::Tools::Phylo::PAML object $paml_result, but this doesn't store any Bio::Tools::Phylo::PAML::ModelResult that could be accessed using get_NSSite_results. See below a little piece of code to illustrate what I'm saying.

my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml
(
-alignment => $codon_MSA,
-tree => $biotree,
-params => {
...
...parameter values
...
},
);
my ($rc,$parser) = $codeml_factory->run(); # or run($dna_aln,$biotree)
#$codeml_factory->cleanup();
my $paml_result = $parser->next_result;
say Dumper $paml_result; #This returns a true Bio::Tools::Phylo::PAML::Result object!!!
my $model_result= $paml_result->get_NSSite_results;
say Dumper $model_result; #This doesn't return a true Bio::Tools::Phylo::PAML::ModelResult object ($VAR1 = 0;)!!!
$ns_string = "model ".$model_result->model_num."\n".$model_result->model_description()."\n".$model_result->time_used."\n";

As no ModelResult object is generated, the script stops, returning:
Can't call method "model_num" without a package or object reference

That's why I was trying to save the mlc output file and parse it, instead of parsing the Bio::Tools::Phylo::PAML object directly.
Best,
Lorenzo
PS: I'm using PAML version 4.4b, July 2010, and BioPerl 1.006901 on Mac OS X.

Quoting Jason Stajich:

> I think you are mistaken on how to use the factory running objects and
> associated parser.
>
> You don't have to instantiate a parser as this is what is returned by the run
> command. The whole point is you don't need to get to the tempdir or specify
> opening of the mlc file or all the other output files from the program. You
> get to use the parser to get the data out and then it cleans up afterwards so
> you can run many iterations of runs in separate folders without having to
> cleanup afterwards.
>
> http://www.bioperl.org/wiki/HOWTO:PAML
>
> my $factory = Bio::Tools::Run::Phylo::PAML::Codeml->new( ... );
> my ($rc,$parser) = $factory->run( );
>
> if( my $result = $parser->next_result ) {
> # $result is a Bio::Tools::Phylo::PAML object
> }
>
>
> On Aug 29, 2011, at 4:19 PM, Lorenzo Carretero Paulet wrote:
>
> > Kevin,
> > That's pretty reasonable, but unfortunately still doesn't run. Even if I
> create
> > the file as $outfile and give it as value to the wrapper as -outfile
> > =>$outfile.
It seems as if Bio::Tools::Run::Phylo::PAML::Codeml failed at
> > creating the outfile. Did anyone manage to generate the outfile from
> > Bio::Tools::Run::Phylo::PAML::Codeml?
> > Cheers,
> > Lorenzo
> >
> > Quoting Kevin Brown:
> >
> >> OK, went back to the original message.
> >>
> >> And here's where the problem actually originates...
> >>
> >> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml
> >> (
> >> # this should cause it to create a file called mlc
> >> -outfile => '>mlc',
> >> -save_tempfiles => 1,
> >> -alignment => $codon_MSA,
> >> -tree => $biotree,
> >> -params =>
> >> {
> >> 'verbose' => 1,
> >> 'noisy' => 9,
> >> 'runmode' => 0, #user tree
> >> 'seqtype' => 1,
> >> 'model' => $model,
> >> 'NSsites' => $NSsites,
> >> 'fix_omega' => $fix_omega,
> >> 'omega' => $omega,
> >> 'ncatG' => $ncatG,
> >> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below (5:ciliate nuclear)
> >> #'fix_alpha' => 0,
> >> #'fix_kappa' => 0,
> >> #'RateAncestor' => 0,
> >> 'CodonFreq' => 2,
> >> 'cleandata' => 1, # remove sites with ambiguity data (1 yes, 0 no),
> >> 'ndata' => 1
> >> },
> >> );
> >>
> >>
> >> Kevin Brown
> >> Center for Innovations in Medicine
> >> Biodesign Institute
> >> Arizona State University
> >>
> >>> -----Original Message-----
> >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >>> bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet
> >>> Sent: Monday, August 29, 2011 9:47 AM
> >>> To: bioperl-l at lists.open-bio.org
> >>> Subject: [Bioperl-l] Saving Codeml Output file
> >>>
> >>> Hi all,
> >>> I'm running codeml from the PAML package using the corresponding
> >>> Bioperl
> >>> wrapper.
I'd like to save the output file as -outfile => 'mlc', as in:
> >>> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml
> >>> ( -outfile => 'mlc',
> >>> -save_tempfiles => 1,
> >>> -alignment => $codon_MSA,
> >>> -tree => $biotree,
> >>> -params =>
> >>> {
> >>> #'outfile' =>'mlc',
> >>> 'verbose' => 1,
> >>> 'noisy' => 9,
> >>> 'runmode' => 0, #user tree
> >>> 'seqtype' => 1,
> >>> 'model' => $model,
> >>> 'NSsites' => $NSsites,
> >>> 'fix_omega' => $fix_omega,
> >>> 'omega' => $omega,
> >>> 'ncatG' => $ncatG,
> >>> 'icode' => 0, #* 0:universal code; 1:mammalian mt; 2-10:see below (5:ciliate nuclear)
> >>> #'fix_alpha' => 0,
> >>> #'fix_kappa' => 0,
> >>> #'RateAncestor' => 0,
> >>> 'CodonFreq' => 2,
> >>> 'cleandata' => 1, # remove sites with ambiguity data (1 yes, 0 no),
> >>> 'ndata' => 1
> >>> },
> >>> );
> >>>
> >>> and subsequently parsing it using
> >>> my $parserF= Bio::Tools::Phylo::PAML->new (-file => "mlc", -dir =>
> >>> "./");
> >>>
> >>> However, I get the following message.
> >>>
> >>> ------------- EXCEPTION -------------
> >>> MSG: Could not open mlc: No such file or directory
> >>> STACK Bio::Root::IO::_initialize_io
> >>> /Library/Perl//5.10.0/Bio/Root/IO.pm:351
> >>> STACK Bio::Tools::Phylo::PAML::new
> >>> /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:239
> >>> STACK main::BranchSiteEvolAnalysis
> >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:1421
> >>> STACK toplevel
> >>> /Users/marioafares/Documents/workspace/BactEvolGen/BactEvolGen.pl:939
> >>> -------------------------------------
> >>>
> >>> which I guess means the output file is not being saved in the previous
> >>> step.
> >>> Does anyone know what's wrong?
> >>> Thank you very much in advance for your help.
> >>> Cheers,
> >>> Lorenzo
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From shalabh.sharma7 at gmail.com Tue Aug 30 15:26:00 2011
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Tue, 30 Aug 2011 11:26:00 -0400
Subject: [Bioperl-l] Bioperl query....
In-Reply-To: <4E5CC8AC.8050800@gmail.com>
References: <4E5CC8AC.8050800@gmail.com>
Message-ID:

Hi Manju,
Just follow the link sent by Roy. It also contains some useful example scripts.
What I am suggesting is that you run a BLAST search on a very small data set that you can inspect easily by hand. Then parse it using SearchIO (follow the link) and you will get a fair idea of how it works.

-Shalabh

On Tue, Aug 30, 2011 at 7:25 AM, Roy Chaudhuri wrote:

> Hi Manju,
>
> See:
> http://www.bioperl.org/wiki/HOWTO:SearchIO
>
> Cheers,
> Roy.
>
>
> On 30/08/2011 12:22, Manju Rawat wrote:
>
>> Hey, please help me.
>> I am very new to BioPerl.
>> I want to use a BLAST report in my program,
>> but I don't know how to use it. Please tell me how to use the HSP, gaps, etc.
>> methods,
>> and how to use them to extract values from a BLAST file.
>>
>> Thanks
>> Manju Rawat
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
Shalabh Sharma
Scientific Computing Professional Associate (Bioinformatics Specialist)
Department of Marine Sciences
University of Georgia
Athens, GA 30602-3636

From longbow0 at gmail.com Wed Aug 31 15:48:16 2011
From: longbow0 at gmail.com (longbow leo)
Date: Wed, 31 Aug 2011 10:48:16 -0500
Subject: [Bioperl-l] How to color leaves of a tree by Bio::Tree::Draw::Cladogram?
Message-ID:

Dear all,

I am using the module Bio::Tree::Draw::Cladogram to create a tree diagram.
But when I tried to color the tree leaves, the diagram was still without any
colors.

How can I color tree leaves? Thanks in advance.

Here is my script:

######################################################################

#!/usr/bin/perl

use strict;
use warnings;

use Bio::TreeIO;
use Bio::Tree::Draw::Cladogram;

my $treei = Bio::TreeIO->new(
    -fh => \*DATA,
    -format => 'newick',
);

my $tree = $treei->next_tree;

# Color node 'B' to red
my ($nodeB) = $tree->find_node( -id => 'B' );

$nodeB->add_tag_value('Rcolor', 1);
$nodeB->add_tag_value('Gcolor', 0);
$nodeB->add_tag_value('Bcolor', 0);

my $cg = Bio::Tree::Draw::Cladogram->new(
    -tree => $tree,
);

$cg->print( -file => 'mytree.eps' );

__DATA__
(((A:5,B:5)90:2,C:4)25:3,D:10);

######################################################################

Regards,

Haizhou

From roy.chaudhuri at gmail.com Wed Aug 31 16:02:30 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Wed, 31 Aug 2011 17:02:30 +0100
Subject: [Bioperl-l] How to color leaves of a tree by Bio::Tree::Draw::Cladogram?
In-Reply-To:
References:
Message-ID: <4E5E5B16.9070704@gmail.com>

Hi Haizhou,

I think you need to specify -colors=>1 in your
Bio::Tree::Draw::Cladogram constructor:

my $cg = Bio::Tree::Draw::Cladogram->new(
    -tree => $tree,
    -colors => 1
);

Not sure why this isn't on by default.

Roy.

On 31/08/2011 16:48, longbow leo wrote:
> Dear all,
>
> I am using the module Bio::Tree::Draw::Cladogram to create a tree diagram.
> But when I tried to color the tree leaves, the diagram was still without any
> colors.
>
> How can I color tree leave? Thanks in advance.
>
> Here is my script:
>
> ######################################################################
>
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> use Bio::TreeIO;
> use Bio::Tree::Draw::Cladogram;
>
> my $treei = Bio::TreeIO->new(
> -fh => \*DATA,
> -format => 'newick',
> );
>
> my $tree = $treei->next_tree;
>
> # Color node 'B' to red
> my ($nodeB) = $tree->find_node( -id => 'B' );
>
> $nodeB->add_tag_value('Rcolor', 1);
> $nodeB->add_tag_value('Gcolor', 0);
> $nodeB->add_tag_value('Bcolor', 0);
>
> my $cg = Bio::Tree::Draw::Cladogram->new(
> -tree => $tree,
> );
>
> $cg->print( -file => 'mytree.eps' );
>
> __DATA__
> (((A:5,B:5)90:2,C:4)25:3,D:10);
>
>
> ######################################################################
>
> Regards,
>
> Haizhou
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
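
For reference, combining Roy's suggestion with the script from the original post gives roughly the sketch below. It assumes BioPerl is installed along with whatever Bio::Tree::Draw::Cladogram itself depends on. The only change from the posted script is the added -colors => 1 argument; the tag names, the tree, and the output file name mytree.eps are all carried over from the thread and have not been verified independently here.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Bio::TreeIO;
use Bio::Tree::Draw::Cladogram;

# Read the Newick tree from the __DATA__ section, as in the posted script
my $treei = Bio::TreeIO->new(
    -fh     => \*DATA,
    -format => 'newick',
);

my $tree = $treei->next_tree;

# Tag leaf 'B' as red (R=1, G=0, B=0), exactly as in the original post
my ($nodeB) = $tree->find_node( -id => 'B' );

$nodeB->add_tag_value('Rcolor', 1);
$nodeB->add_tag_value('Gcolor', 0);
$nodeB->add_tag_value('Bcolor', 0);

# -colors => 1 is the fix suggested in the reply; without it the
# color tags set above were reported to have no effect
my $cg = Bio::Tree::Draw::Cladogram->new(
    -tree   => $tree,
    -colors => 1,
);

# Write the drawing to an EPS file in the current directory
$cg->print( -file => 'mytree.eps' );

__DATA__
(((A:5,B:5)90:2,C:4)25:3,D:10);
```

Running this should write mytree.eps to the current directory. Whether leaf B is actually drawn in red depends on the installed BioPerl version handling the Rcolor/Gcolor/Bcolor tags as described in the thread.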