From anjan.purkayastha at gmail.com Mon Mar 3 12:31:11 2008 From: anjan.purkayastha at gmail.com (ANJAN PURKAYASTHA) Date: Mon, 3 Mar 2008 12:31:11 -0500 Subject: [Bioperl-l] problem with Bio::Tools::EMBOSS Message-ID: hi i am tried to use the perl wrappers for EMBOSS with: use lib "/Users/anjan/perl_directory/bioperl-1.5.2_102/"; use Bio::Factory::EMBOSS; however it seems that Bio::Factory::EMBOSS cannot be found in the bioperl directory mentioned above. so i tried to install Bio::Factory::EMBOSS from the cpan website. i got the attached error message. any ideas on what i need to do to make this work? all advice will be appreciated. tia, anjan -- ANJAN PURKAYASTHA, PhD. Senior Computational Biologist ========================== 1101 King Street, Suite 310, Alexandria, VA 22314. 703.518.8040 (office) 703.740.6939 (mobile) email: anjan at vbi.vt.edu; anjan.purkayastha at gmail.com http://www.vbi.vt.edu ========================== -------------- next part -------------- A non-text attachment was scrubbed... Name: emboss_install_error_message.rtf Type: application/rtf Size: 123212 bytes Desc: not available URL: From cjfields at uiuc.edu Mon Mar 3 13:54:06 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 3 Mar 2008 12:54:06 -0600 Subject: [Bioperl-l] problem with Bio::Tools::EMBOSS In-Reply-To: References: Message-ID: You'll need to install bioperl-run. Bio::Factory::EMBOSS is in bioperl-run, not the main bioperl distribution (aka bioperl-core). chris On Mar 3, 2008, at 11:31 AM, ANJAN PURKAYASTHA wrote: > hi > i am tried to use the perl wrappers for EMBOSS with: > > use lib "/Users/anjan/perl_directory/bioperl-1.5.2_102/"; > use Bio::Factory::EMBOSS; > > however it seems that Bio::Factory::EMBOSS cannot be found in the > bioperl > directory mentioned above. > > so i tried to install Bio::Factory::EMBOSS from the cpan website. i > got the > attached error message. > > any ideas on what i need to do to make this work? > all advice will be appreciated. 
> > tia, > > anjan > > > -- > ANJAN PURKAYASTHA, PhD. > Senior Computational Biologist > ========================== > > 1101 King Street, Suite 310, > Alexandria, VA 22314. > 703.518.8040 (office) > 703.740.6939 (mobile) > > email: > anjan at vbi.vt.edu; > anjan.purkayastha at gmail.com > > http://www.vbi.vt.edu > > ========================== > < > emboss_install_error_message > .rtf>_______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From David.Messina at sbc.su.se Mon Mar 3 14:34:20 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 3 Mar 2008 20:34:20 +0100 Subject: [Bioperl-l] problem with Bio::Tools::EMBOSS In-Reply-To: References: Message-ID: <628aabb70803031134g3263e94fge7131f8862434b23@mail.gmail.com> Hi Anjan, Bio::Factory::EMBOSS is not part of the BioPerl core distribution, but rather part of bioperl-run. For some reason CPAN went for the old (1.4) version of bioperl-run rather than the current 1.5.2. And indeed, I seem to run into the same problem: cpan> d /bioperl/ Distribution BIRNEY/bioperl-1.2.1.tar.gz Distribution BIRNEY/bioperl-1.2.2.tar.gz Distribution BIRNEY/bioperl-1.2.3.tar.gz Distribution BIRNEY/bioperl-1.2.tar.gz Distribution BIRNEY/bioperl-1.4.tar.gz Distribution BIRNEY/bioperl-db-0.1.tar.gz Distribution BIRNEY/bioperl-ext-1.4.tar.gz Distribution BIRNEY/bioperl-gui-0.7.tar.gz Distribution BIRNEY/bioperl-run-1.2.2.tar.gz Distribution BIRNEY/bioperl-run-1.4.tar.gz Distribution BOZO/Fry-Lib-BioPerl-0.15.tar.gz Distribution CRAFFI/Bundle-BioPerl-2.1.8.tar.gz 12 items found but when I ask in a different way the right distributions show up. [Sendu, any idea what's going on here?] 
cpan> ls SENDU 5919092 2007-02-14 SENDU/bioperl-1.5.2_102.tar.gz 320154 2006-12-06 SENDU/bioperl-db-1.5.2_100.tar.gz 99082 2006-12-06 SENDU/bioperl-network-1.5.2_100.tar.gz 942093 2006-12-06 SENDU/bioperl-run-1.5.2_100.tar.gz So try doing cpan> install SENDU/bioperl-run-1.5.2_100.tar.gz Or if CPAN refuses to cooperate, you can grab it from here: http://www.bioperl.org/wiki/Getting_BioPerl#Bioperl_1.5.2.2C_Developer_Release Dave From arareko at campus.iztacala.unam.mx Mon Mar 3 14:25:14 2008 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 03 Mar 2008 13:25:14 -0600 Subject: [Bioperl-l] problem with Bio::Tools::EMBOSS In-Reply-To: References: Message-ID: <47CC509A.10306@campus.iztacala.unam.mx> Hi Anjan, It looks like you are using the latest BioPerl developer release (bioperl-1.5.2_102) from CPAN, to have Bio::Factory::EMBOSS available then you should try installing the latest BioPerl-run as well (bioperl-run-1.5.2_100). After you install it, you'll have to modify your 'use lib' pragma for your script to work as you expect: use lib "/Users/anjan/perl_directory/bioperl-run-1.5.2_100/"; use Bio::Factory::EMBOSS; Hope this helps. Regards, Mauricio. ANJAN PURKAYASTHA wrote: > hi > i am tried to use the perl wrappers for EMBOSS with: > > use lib "/Users/anjan/perl_directory/bioperl-1.5.2_102/"; > use Bio::Factory::EMBOSS; > > however it seems that Bio::Factory::EMBOSS cannot be found in the bioperl > directory mentioned above. > > so i tried to install Bio::Factory::EMBOSS from the cpan website. i got the > attached error message. > > any ideas on what i need to do to make this work? > all advice will be appreciated. 
>
> tia,
>
> anjan
>
>
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From cjfields at uiuc.edu Mon Mar 3 15:05:16 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 3 Mar 2008 14:05:16 -0600
Subject: [Bioperl-l] problem with Bio::Tools::EMBOSS
In-Reply-To: <628aabb70803031134g3263e94fge7131f8862434b23@mail.gmail.com>
References: <628aabb70803031134g3263e94fge7131f8862434b23@mail.gmail.com>
Message-ID: <43EC247B-EC01-483D-82B1-D861590A141A@uiuc.edu>

On Mar 3, 2008, at 1:34 PM, Dave Messina wrote:

> Hi Anjan,
>
> Bio::Factory::EMBOSS is not part of the BioPerl core distribution, but
> rather part of bioperl-run. For some reason CPAN went for the old
> (1.4)
> version of bioperl-run rather than the current 1.5.2.
>
> And indeed, I seem to run into the same problem:
> cpan> d /bioperl/
>
> Distribution BIRNEY/bioperl-1.2.1.tar.gz
> Distribution BIRNEY/bioperl-1.2.2.tar.gz
> Distribution BIRNEY/bioperl-1.2.3.tar.gz
> Distribution BIRNEY/bioperl-1.2.tar.gz
> Distribution BIRNEY/bioperl-1.4.tar.gz
> Distribution BIRNEY/bioperl-db-0.1.tar.gz
> Distribution BIRNEY/bioperl-ext-1.4.tar.gz
> Distribution BIRNEY/bioperl-gui-0.7.tar.gz
> Distribution BIRNEY/bioperl-run-1.2.2.tar.gz
> Distribution BIRNEY/bioperl-run-1.4.tar.gz
> Distribution BOZO/Fry-Lib-BioPerl-0.15.tar.gz
> Distribution CRAFFI/Bundle-BioPerl-2.1.8.tar.gz
> 12 items found
>
> but when I ask in a different way the right distributions show up.
> [Sendu,
> any idea what's going on here?]

It's marked as a developer release, which I think requires a full path (as you have below) and not just the package name.
chris > cpan> ls > SENDU > 5919092 2007-02-14 SENDU/bioperl-1.5.2_102.tar.gz > 320154 2006-12-06 SENDU/bioperl-db-1.5.2_100.tar.gz > 99082 2006-12-06 SENDU/bioperl-network-1.5.2_100.tar.gz > 942093 2006-12-06 SENDU/bioperl-run-1.5.2_100.tar.gz > > So try doing > > cpan> install SENDU/bioperl-run-1.5.2_100.tar.gz > > Or if CPAN refuses to cooperate, you can grab it from here: > http://www.bioperl.org/wiki/Getting_BioPerl#Bioperl_1.5.2.2C_Developer_Release > > > Dave From anjan.purkayastha at gmail.com Mon Mar 3 14:57:33 2008 From: anjan.purkayastha at gmail.com (ANJAN PURKAYASTHA) Date: Mon, 3 Mar 2008 14:57:33 -0500 Subject: [Bioperl-l] problem with Bio::Tools::EMBOSS In-Reply-To: <47CC509A.10306@campus.iztacala.unam.mx> References: <47CC509A.10306@campus.iztacala.unam.mx> Message-ID: guys, thanks! i got bioperl-run to work. next question, let's say i want to run the palindrome program in emboss using the bioperl wrapper. now, palindrome takes in a list of parameter values- these are fed into emboss as a key-value hash. where do i find the correct names of the keys to create the input hash? tia. anjan On Mon, Mar 3, 2008 at 2:25 PM, Mauricio Herrera Cuadra < arareko at campus.iztacala.unam.mx> wrote: > Hi Anjan, > > It looks like you are using the latest BioPerl developer release > (bioperl-1.5.2_102) from CPAN, to have Bio::Factory::EMBOSS available > then you should try installing the latest BioPerl-run as well > (bioperl-run-1.5.2_100). After you install it, you'll have to modify > your 'use lib' pragma for your script to work as you expect: > > use lib "/Users/anjan/perl_directory/bioperl-run-1.5.2_100/"; > use Bio::Factory::EMBOSS; > > Hope this helps. > > Regards, > Mauricio. 
>
>
> ANJAN PURKAYASTHA wrote:
> > hi
> > i am tried to use the perl wrappers for EMBOSS with:
> >
> > use lib "/Users/anjan/perl_directory/bioperl-1.5.2_102/";
> > use Bio::Factory::EMBOSS;
> >
> > however it seems that Bio::Factory::EMBOSS cannot be found in the
> bioperl
> > directory mentioned above.
> >
> > so i tried to install Bio::Factory::EMBOSS from the cpan website. i got
> the
> > attached error message.
> >
> > any ideas on what i need to do to make this work?
> > all advice will be appreciated.
> >
> > tia,
> >
> > anjan
> >
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> MAURICIO HERRERA CUADRA
> arareko at campus.iztacala.unam.mx
> Laboratorio de Genética
> Unidad de Morfofisiología y Función
> Facultad de Estudios Superiores Iztacala, UNAM
>
>
>

--
ANJAN PURKAYASTHA, PhD.
Senior Computational Biologist
==========================

1101 King Street, Suite 310,
Alexandria, VA 22314.
703.518.8040 (office)
703.740.6939 (mobile)

email:
anjan at vbi.vt.edu;
anjan.purkayastha at gmail.com

http://www.vbi.vt.edu

==========================

From Daniel.Gerlach at medecine.unige.ch Tue Mar 4 03:48:15 2008
From: Daniel.Gerlach at medecine.unige.ch (Daniel Gerlach)
Date: Tue, 04 Mar 2008 09:48:15 +0100
Subject: [Bioperl-l] Bio::TreeIO rises error "Weak references are not implemented in the version of perl"
Message-ID: <47CD0CCF.4060306@medecine.unige.ch>

Hello,

Trying to run Bio::TreeIO by this command:

perl -e 'use Bio::TreeIO'

I get the following error:

Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.8/Bio/Tree/Node.pm line 76
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.8/Bio/Tree/Node.pm line 76.
Compilation failed in require at /usr/lib/perl5/site_perl/5.8.8/Bio/TreeIO/TreeEventBuilder.pm line 65.
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.8/Bio/TreeIO/TreeEventBuilder.pm line 65.
Compilation failed in require at /usr/lib/perl5/site_perl/5.8.8/Bio/TreeIO.pm line 77.
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.8/Bio/TreeIO.pm line 77.
Compilation failed in require at -e line 1.
BEGIN failed--compilation aborted at -e line 1.

I am running perl v5.8.8 on Fedora 8 on a 64-bit machine. I installed a recent version of bioperl around 5 months ago. Any suggestions as to why this module can't be loaded correctly?

Greetings,
Daniel

From bix at sendu.me.uk Tue Mar 4 06:55:32 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 04 Mar 2008 11:55:32 +0000
Subject: [Bioperl-l] Bio::TreeIO rises error "Weak references are not implemented in the version of perl"
In-Reply-To: <47CD0CCF.4060306@medecine.unige.ch>
References: <47CD0CCF.4060306@medecine.unige.ch>
Message-ID: <47CD38B4.1070200@sendu.me.uk>

Daniel Gerlach wrote:
> Hello,
>
> Trying to run Bio::TreeIO by this command:
>
> perl -e 'use Bio::TreeIO'
>
> I get the following error:
>
> Weak references are not implemented in the version of perl
> [...]
> I am running perl v5.8.8 on Fedora 8 on a 64-bit machine. I installed a
> recent version of bioperl around 5 months ago. Any suggestions as to why
> this module can't be loaded correctly?

Red Hat/Fedora apparently has Perl issues. First try installing the latest version of Scalar::Util yourself:

perl -MCPAN -e shell
force install Scalar::Util

If that doesn't work, you'll have to download and compile Perl yourself from source (don't use Fedora's installation system).
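Before and after reinstalling, it may help to check directly whether the Scalar::Util that Perl actually loads supports weak references, since the message in the trace above is raised when `weaken` cannot be provided. A minimal sketch using only core Perl; the helper name `weaken_supported` is made up for illustration and is not part of BioPerl:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (illustration only): returns 1 if Scalar::Util can
# export and actually perform weaken(), 0 otherwise.
sub weaken_supported {
    my $ok = eval {
        require Scalar::Util;
        # Requesting the export dies on builds without weak-reference support.
        Scalar::Util->import(qw(weaken isweak));
        my $target = {};
        my $ref    = $target;
        Scalar::Util::weaken($ref);
        Scalar::Util::isweak($ref) ? 1 : 0;
    };
    return $ok ? 1 : 0;
}

my $supported = weaken_supported();
print "Scalar::Util ", ( Scalar::Util->VERSION || 'unknown' ),
      ": weak references ", ( $supported ? "available" : "NOT available" ), "\n";
print "Loaded from: ", ( $INC{'Scalar/Util.pm'} || 'not loaded' ), "\n";
```

If this reports "NOT available", the "Loaded from" path shows which copy of the module is being picked up, which is useful for confirming that a forced reinstall really replaced it.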
From apapanicolaou at ice.mpg.de Tue Mar 4 07:03:27 2008
From: apapanicolaou at ice.mpg.de (Alexie Papanicolaou)
Date: Tue, 04 Mar 2008 13:03:27 +0100
Subject: [Bioperl-l] Bio/SearchIO/Writer/GbrowseGFF.pm
Message-ID: <47CD3A8F.9050902@ice.mpg.de>

hello all,

1) I was wondering if you would know what this error means and had time to help...

Use of uninitialized value in concatenation (.) or string at /usr/local/share/perl/5.8.8/Bio/SearchIO/Writer/GbrowseGFF.pm line 287

line 287 is
else {
    $tags{'Target'} = "$prefix:$seqname $qpmax $qpmin";
}

this is the header
# $Id: GbrowseGFF.pm,v 1.15.4.1 2006/10/02 23:10:27 sendu Exp $
#
# BioPerl module Bio::SearchIO::Writer::GbrowseGFF.pm


this is how I call it... ( 2.6.18-6-amd64, x86_64, perl, v5.8.8, bioperl: tried with both 1.5.2_102 from cvs and checked out svn version today)

use Bio::SearchIO::Writer::GbrowseGFF;
use Bio::SearchIO;
if ($program eq "blastn"){
    #my $out_gff = new Bio::SearchIO(-writer => $writer_gff,
    my $out_gff = new Bio::SearchIO(-output_format => 'GbrowseGFF',
        -output_cigar => 1,
        -output_signif => 1,
        -file => ">$infile.$query.blast.gff");
    #my $out_gff_whole = new Bio::SearchIO(-writer => $writer_gff,
    my $out_gff_whole = new Bio::SearchIO(-output_format => 'GbrowseGFF',
        -output_cigar => 1,
        -output_signif => 1,
        -file => ">>$infile.blast.gff");
    $out_gff->write_result($result);
    $out_gff_whole->write_result($result);
}

Where $result is a blast result...

The aim is to parse a multi-query blast report and split it into different queries and make another file with all the queries. I'm sure I'm forgetting something but I can't figure out what...

The GFF file is produced, but I do get the error above...

2) Finally, there is a small bug but I don't think it comes from this module? The id attribute is printed out e.g. iD=match_sequence31 with iD wrongly capitalised...
many thanks for your time
alexie

--
--
Alexie Papanicolaou
Entomology
Max Planck Institute for Chemical Ecology
Hans Knoell Str 8
Jena 07745
Germany
Email apapanicolaou at ice.mpg.de
Tel +493641571561

From apapanicolaou at ice.mpg.de Tue Mar 4 07:04:16 2008
From: apapanicolaou at ice.mpg.de (Alexie Papanicolaou)
Date: Tue, 04 Mar 2008 13:04:16 +0100
Subject: [Bioperl-l] Gbrowse.pm followup
Message-ID: <47CD3AC0.4080801@ice.mpg.de>

Oh the iD bug is fixed in the svn developer branch.

ta
a

--
--
Alexie Papanicolaou
Entomology
Max Planck Institute for Chemical Ecology
Hans Knoell Str 8
Jena 07745
Germany
Email apapanicolaou at ice.mpg.de
Tel +493641571561

From cjfields at uiuc.edu Tue Mar 4 08:16:04 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 4 Mar 2008 07:16:04 -0600
Subject: [Bioperl-l] Bio/SearchIO/Writer/GbrowseGFF.pm
In-Reply-To: <47CD3A8F.9050902@ice.mpg.de>
References: <47CD3A8F.9050902@ice.mpg.de>
Message-ID: <4A68AA28-E508-4257-86E1-393CA9B74082@uiuc.edu>

I have run into a number of problems with the GbrowseGFF module myself (I think I committed the ID fix, actually). It works but needs revision and needs better conformity with GFF3.

You can post (1) as a bug and we'll look into it when we can. It's possible (depending on how extensive the fix is) this may have to wait until 1.7.

chris

On Mar 4, 2008, at 6:03 AM, Alexie Papanicolaou wrote:

> hello all,
>
> 1) I was wondering if you would know what this error means and
> had time to help...
>
> Use of uninitialized value in concatenation (.) or string at /usr/
> local/share/perl/5.8.8/Bio/SearchIO/Writer/GbrowseGFF.pm line 287
>
> line 287 is
> else {
> $tags{'Target'} = "$prefix:$seqname $qpmax $qpmin";
> }
>
> this is the header
> # $Id: GbrowseGFF.pm,v 1.15.4.1 2006/10/02 23:10:27 sendu Exp $
> #
> # BioPerl module Bio::SearchIO::Writer::GbrowseGFF.pm
>
>
> this is how I call it...
( 2.6.18-6-amd64, x86_64, perl, v5.8.8, > bioperl: tried with both 1.5.2_102 from cvs and checked out svn > version today) > > use Bio::SearchIO::Writer::GbrowseGFF; > use Bio::SearchIO; > if ($program eq "blastn"){ > #my $out_gff = new Bio::SearchIO(-writer => $writer_gff, > my $out_gff = new Bio::SearchIO(-output_format => 'GbrowseGFF', > -output_cigar => 1, > -output_signif => 1, > -file => ">$infile.$query.blast.gff"); > #my $out_gff_whole = new Bio::SearchIO(-writer => $writer_gff, > my $out_gff_whole = new Bio::SearchIO(-output_format => 'GbrowseGFF', > -output_cigar => 1, > -output_signif => 1, > -file => ">>$infile.blast.gff"); > $out_gff->write_result($result); > $out_gff_whole->write_result($result); > } > > > > Where $result is a blast result... > > The aim is to parse a multi-query blast report and split it into > different queries and make another file with all the queries. I'm > sure i'm forgetting something but I can't figure what... > > The GFF file is produced, but I do get the error above... > > 2) Finally, there is a small bug but I don't think it comes from > this module? The id attribute is printed out e.g iD=match_sequence31 > with iD wrongly capitalised... > > many thanks for your time > alexie > > -- > -- > Alexie Papanicolaou > Entomology > Max Planck Institute for Chemical Ecology > Hans Knoell Str 8 > Jena 07745 > Germany > Email apapanicolaou at ice.mpg.de > Tel +493641571561 > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From Daniel.Gerlach at medecine.unige.ch Tue Mar 4 07:35:03 2008 From: Daniel.Gerlach at medecine.unige.ch (Daniel Gerlach) Date: Tue, 04 Mar 2008 13:35:03 +0100 Subject: [Bioperl-l] Remove columns containing more than 75% gaps in an alignment References: <200502151616.j1FGGnKr023827@portal.open-bio.org> Message-ID: <47CD41F7.2000401@medecine.unige.ch> Hello, Is it possible to remove only columns containing e.g. more than 75% gaps from an alignment? I was thinking at $aln2 = $aln->remove_gaps('-'[,$all_gaps_columns]) This would allow me to remove all gaps or gap-only columns but not using a threshold. Greetings, Daniel From Daniel.Gerlach at medecine.unige.ch Tue Mar 4 08:46:33 2008 From: Daniel.Gerlach at medecine.unige.ch (Daniel Gerlach) Date: Tue, 04 Mar 2008 14:46:33 +0100 Subject: [Bioperl-l] branch length score - total length of the spanning subtree Message-ID: <47CD52B9.5060906@medecine.unige.ch> Hello, I would like to use bioperl to calculate a branch length score for a given set of nodes and a tree. I know how to get the total branch length by using $tree->total_branch_length, but how could I get the length of the subtree spanning some given nodes which are dispersed over the whole tree (a subset of nodes from the tree which are not monophyletic)? Greetings, Daniel From bix at sendu.me.uk Tue Mar 4 09:37:53 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 04 Mar 2008 14:37:53 +0000 Subject: [Bioperl-l] branch length score - total length of the spanning subtree In-Reply-To: <47CD52B9.5060906@medecine.unige.ch> References: <47CD52B9.5060906@medecine.unige.ch> Message-ID: <47CD5EC1.2020103@sendu.me.uk> Daniel Gerlach wrote: > Hello, > > I would like to use bioperl to calculate a branch length score for a > given set of nodes and a tree. 
I know how to get the total branch length > by using $tree->total_branch_length, but how could I get the length of > the subtree spanning some given nodes which are dispersed over the whole > tree (a subset of nodes from the tree which are not monophyletic)? One 'cheat' way of doing it might be to use splice(-keep_ids => \@node_ids) or similar, then run total_branch_length() on that. No idea if it will actually give you the right answer though. Let us know! :) From bix at sendu.me.uk Tue Mar 4 09:26:10 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 04 Mar 2008 14:26:10 +0000 Subject: [Bioperl-l] Remove columns containing more than 75% gaps in an alignment In-Reply-To: <47CD41F7.2000401@medecine.unige.ch> References: <200502151616.j1FGGnKr023827@portal.open-bio.org> <47CD41F7.2000401@medecine.unige.ch> Message-ID: <47CD5C02.8060306@sendu.me.uk> Daniel Gerlach wrote: > Hello, > > Is it possible to remove only columns containing e.g. more than 75% gaps > from an alignment? I was thinking at > > $aln2 = $aln->remove_gaps('-'[,$all_gaps_columns]) > > This would allow me to remove all gaps or gap-only columns but not using > a threshold. Well, you can use gap_col_matrix() to decide which columns you don't want, and then use remove_columns(). From hlapp at gmx.net Tue Mar 4 10:24:13 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 4 Mar 2008 10:24:13 -0500 Subject: [Bioperl-l] Bio/SearchIO/Writer/GbrowseGFF.pm In-Reply-To: <47CD3A8F.9050902@ice.mpg.de> References: <47CD3A8F.9050902@ice.mpg.de> Message-ID: <87808BE4-B6A3-4C7F-A6DC-42ED2686375B@gmx.net> On Mar 4, 2008, at 7:03 AM, Alexie Papanicolaou wrote: > Use of uninitialized value in concatenation (.) or string at /usr/ > local/share/perl/5.8.8/Bio/SearchIO/Writer/GbrowseGFF.pm line 287 > > line 287 is > else { > $tags{'Target'} = "$prefix:$seqname $qpmax $qpmin"; > } Note that this is a warning, not an error. 
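The difference can be seen in a minimal sketch that assumes nothing beyond core Perl. The sub `build_target` and the values passed to it are invented for illustration and are not taken from GbrowseGFF.pm; only the interpolated string mirrors the quoted line. Interpolating an undefined value emits the warning, but the program keeps running and the undefined value is treated as an empty string:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the expression on the quoted line 287.
sub build_target {
    my ($prefix, $seqname, $qpmax, $qpmin) = @_;
    return "$prefix:$seqname $qpmax $qpmin";
}

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# $qpmin is deliberately left undefined.
my $target = build_target('Sequence', 'seq31', 120, undef);

print "Target tag: '$target'\n";                   # execution continued
print "Warnings caught: ", scalar(@warnings), "\n";
print "First warning: $warnings[0]" if @warnings;
```

Adding `use warnings FATAL => 'uninitialized';` in the calling script is one way to turn such a warning into a real error while tracking down where the undefined value comes from.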
However, if none of $prefix, $seqname, $qpmax, $qpmin can be undefined (or be equal to an empty string, which they will default to if undefined) at this position, then there is a problem (and it is before the above line). -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Tue Mar 4 11:02:02 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 4 Mar 2008 11:02:02 -0500 Subject: [Bioperl-l] branch length score - total length of the spanning subtree In-Reply-To: <47CD5EC1.2020103@sendu.me.uk> References: <47CD52B9.5060906@medecine.unige.ch> <47CD5EC1.2020103@sendu.me.uk> Message-ID: On Mar 4, 2008, at 9:37 AM, Sendu Bala wrote: > Daniel Gerlach wrote: >> Hello, >> I would like to use bioperl to calculate a branch length score for >> a given set of nodes and a tree. I know how to get the total >> branch length by using $tree->total_branch_length, but how could I >> get the length of the subtree spanning some given nodes which are >> dispersed over the whole tree (a subset of nodes from the tree >> which are not monophyletic)? > > One 'cheat' way of doing it might be to use splice(-keep_ids => > \@node_ids) or similar, then run total_branch_length() on that. No > idea if it will actually give you the right answer though. Let us > know! :) Related to that, will contract_linear_paths() actually do the right thing and adjust branch lengths if it removes internal nodes with outdegree 1? Rutger - does Bio::Phylo handle this correctly? 
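For reference, the behaviour one would hope for can be sketched in plain Perl without BioPerl: when an internal node with a single child is removed, its branch length is added to the child's, so the total branch length is preserved. The hash-based toy tree and the subs `node`, `total_length` and `contract_linear` are illustrative assumptions only; they say nothing about what contract_linear_paths() or Bio::Phylo actually do.

```perl
use strict;
use warnings;

# Toy node: { name => label, len => branch length to parent, kids => [children] }
sub node {
    my ($name, $len, @kids) = @_;
    return { name => $name, len => $len, kids => [@kids] };
}

# Sum of all branch lengths in the subtree rooted at $node.
sub total_length {
    my ($node) = @_;
    my $sum = $node->{len} || 0;
    $sum += total_length($_) for @{ $node->{kids} };
    return $sum;
}

# Remove nodes with exactly one child, adding the removed node's
# branch length to that child so no length is lost.
sub contract_linear {
    my ($node) = @_;
    my @kids = map { contract_linear($_) } @{ $node->{kids} };
    if (@kids == 1) {
        my $child = $kids[0];
        $child->{len} = ($child->{len} || 0) + ($node->{len} || 0);
        return $child;
    }
    $node->{kids} = \@kids;
    return $node;
}

# Small example tree: z has a single child x, so z should be contracted.
my $tree = node('root', 0,
    node('z', 3, node('x', 2, node('A', 5), node('B', 5))),
    node('E', 10),
);

my $before = total_length($tree);
$tree = contract_linear($tree);
my $after = total_length($tree);
my ($x) = grep { $_->{name} eq 'x' } @{ $tree->{kids} };

print "total before: $before\n";                           # 25
print "total after:  $after\n";                            # 25
print "x branch length after contraction: $x->{len}\n";    # 5
```

A test along these lines against the real tree methods would show whether the total branch length changes when single-child internal nodes are removed.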
-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From Daniel.Gerlach at medecine.unige.ch Tue Mar 4 11:12:53 2008
From: Daniel.Gerlach at medecine.unige.ch (Daniel Gerlach)
Date: Tue, 04 Mar 2008 17:12:53 +0100
Subject: [Bioperl-l] branch length score - total length of the spanning subtree
In-Reply-To: <47CD5EC1.2020103@sendu.me.uk>
References: <47CD52B9.5060906@medecine.unige.ch> <47CD5EC1.2020103@sendu.me.uk>
Message-ID: <47CD7505.5080105@medecine.unige.ch>

Hello,

Thanks for the quick answer. I tried:

use Bio::TreeIO;
my $treeio = Bio::TreeIO->new(-format => 'newick', -fh => \*DATA);
my $tree = $treeio->next_tree;
print $tree->total_branch_length,"\n";
$tree->splice(-keep_id => [A,B,E]);
print $tree->total_branch_length,"\n";
__DATA__
(((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10);

Which gives me the message "MSG: After splicing, the original root was removed but there are multiple candidates for the new root!" however the root E was not removed.

If I do it the complementary way by splicing out all unwanted nodes - splice(-remove_id => [C,D]) - I get what I want:

34
25

Greetings,
Daniel

Sendu Bala wrote:
> Daniel Gerlach wrote:
>> Hello,
>>
>> I would like to use bioperl to calculate a branch length score for a
>> given set of nodes and a tree. I know how to get the total branch
>> length by using $tree->total_branch_length, but how could I get the
>> length of the subtree spanning some given nodes which are dispersed
>> over the whole tree (a subset of nodes from the tree which are not
>> monophyletic)?
>
> One 'cheat' way of doing it might be to use splice(-keep_ids =>
> \@node_ids) or similar, then run total_branch_length() on that. No idea
> if it will actually give you the right answer though. Let us know!
:) From bix at sendu.me.uk Tue Mar 4 11:37:47 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 04 Mar 2008 16:37:47 +0000 Subject: [Bioperl-l] branch length score - total length of the spanning subtree In-Reply-To: References: <47CD52B9.5060906@medecine.unige.ch> <47CD5EC1.2020103@sendu.me.uk> Message-ID: <47CD7ADB.6050808@sendu.me.uk> Hilmar Lapp wrote: > > On Mar 4, 2008, at 9:37 AM, Sendu Bala wrote: > >> Daniel Gerlach wrote: >>> Hello, >>> I would like to use bioperl to calculate a branch length score for a >>> given set of nodes and a tree. I know how to get the total branch >>> length by using $tree->total_branch_length, but how could I get the >>> length of the subtree spanning some given nodes which are dispersed >>> over the whole tree (a subset of nodes from the tree which are not >>> monophyletic)? >> >> One 'cheat' way of doing it might be to use splice(-keep_ids => >> \@node_ids) or similar, then run total_branch_length() on that. No >> idea if it will actually give you the right answer though. Let us >> know! :) > > Related to that, will contract_linear_paths() actually do the right > thing and adjust branch lengths if it removes internal nodes with > outdegree 1? I think ultimately it boils down to remove_Descendent() being called as appropriate which does the branch length alteration. From a glance I can't answer your question with certainly, but it 'should' do the right thing. It needs to be tested; when I implemented these things I was only concerned with tree topology, not branch lengths or anything else. From David.Messina at sbc.su.se Tue Mar 4 15:47:06 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Tue, 4 Mar 2008 21:47:06 +0100 Subject: [Bioperl-l] problem with Bio::Tools::EMBOSS In-Reply-To: References: <47CC509A.10306@campus.iztacala.unam.mx> Message-ID: <628aabb70803041247l6c5a52d6n5cab24e7059f15fb@mail.gmail.com> > where do i find the > correct names of the keys to create the input hash? 
I've never used this module, but from a quick look at the code it appears to pass on any parameters to palindrome. I'm guessing you've already done this, but have you tried using the parameter names and values that palindrome itself asks for? Dave From cjfields at uiuc.edu Tue Mar 4 16:34:21 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 4 Mar 2008 15:34:21 -0600 Subject: [Bioperl-l] OBDA, Bio::DB::Flat, and bioperl Message-ID: I don't know what the current status is for OBDA, but we have several bugs listed for Bio::DB::Flat which need someone versed in OBDA to look at them (they are all interrelated): http://bugzilla.open-bio.org/show_bug.cgi?id=2336 http://bugzilla.open-bio.org/show_bug.cgi?id=2337 http://bugzilla.open-bio.org/show_bug.cgi?id=2338 http://bugzilla.open-bio.org/show_bug.cgi?id=2339 If anyone has any input I would greatly appreciate it. I have been trying to stomp as many bugs as possible so we can work on a new release. chris From bosborne11 at verizon.net Tue Mar 4 16:42:05 2008 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 04 Mar 2008 16:42:05 -0500 Subject: [Bioperl-l] OBDA, Bio::DB::Flat, and bioperl In-Reply-To: References: Message-ID: Chris, I'll take a look at them this weekend. Brian O. On Mar 4, 2008, at 4:34 PM, Chris Fields wrote: > I don't know what the current status is for OBDA, but we have > several bugs listed for Bio::DB::Flat which need someone versed in > OBDA to look at them (they are all interrelated): > > http://bugzilla.open-bio.org/show_bug.cgi?id=2336 > http://bugzilla.open-bio.org/show_bug.cgi?id=2337 > http://bugzilla.open-bio.org/show_bug.cgi?id=2338 > http://bugzilla.open-bio.org/show_bug.cgi?id=2339 > > If anyone has any input I would greatly appreciate it. I have been > trying to stomp as many bugs as possible so we can work on a new > release. 
> > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From anjan.purkayastha at gmail.com Tue Mar 4 18:52:09 2008 From: anjan.purkayastha at gmail.com (ANJAN PURKAYASTHA) Date: Tue, 4 Mar 2008 18:52:09 -0500 Subject: [Bioperl-l] problem with Bio::Tools::EMBOSS In-Reply-To: <628aabb70803041247l6c5a52d6n5cab24e7059f15fb@mail.gmail.com> References: <47CC509A.10306@campus.iztacala.unam.mx> <628aabb70803041247l6c5a52d6n5cab24e7059f15fb@mail.gmail.com> Message-ID: guys, thanks for all your inputs. i went to the following site: http://www.koders.com/perl/fid5F28A3DDD453F0DB4995B7DDF304B02DBBACE0A0.aspx?s=calculate they have the key names for most of the emboss programs. thanks, anjan On Tue, Mar 4, 2008 at 3:47 PM, Dave Messina wrote: > > where do i find the > > correct names of the keys to create the input hash? > > > > I've never used this module, but from a quick look at the code it appears > to pass on any parameters to palindrome. > > I'm guessing you've already done this, but have you tried using the > parameter names and values that palindrome itself asks for? > > > Dave > > -- ANJAN PURKAYASTHA, PhD. Senior Computational Biologist ========================== 1101 King Street, Suite 310, Alexandria, VA 22314. 703.518.8040 (office) 703.740.6939 (mobile) email: anjan at vbi.vt.edu; anjan.purkayastha at gmail.com http://www.vbi.vt.edu ========================== From staffa at niehs.nih.gov Wed Mar 5 18:43:30 2008 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Wed, 05 Mar 2008 18:43:30 -0500 Subject: [Bioperl-l] SeqIO Message-ID: So the Howto says that Bio::SeqIO will read almost any known format including GCG. So I create a GCG file with Seqlab and try to printout its sequence as a string. 
(I did guess at the way to get the sequence string.)

#!/usr/bin/perl -w
use strict;
$| = 1;
use Bio::SeqIO;
my $number_of_files = @ARGV;
if(!$number_of_files){print "no files entered\n";exit;}
foreach my $file (@ARGV){
    my $seqio_object = Bio::SeqIO->new(-file => $file);
    my $seq_object = $seqio_object->next_seq;
    my $sequence = $seq_object->seq;
    print "$sequence\n";
    my $status = &windowscore($sequence);
}

But what it returned was the entire contents of the file with no format decoding. Have I been deluded?

NewDNALength:810March5,200818:26Type:NCheck:3368..1TGTTCGAATTCCGTGCGGTCCACCT
CCCCTAGGAGCTCAGTGGGCTGGTT51GGATTCCGTGCCATCCCGGCAGGGCAGAGCCTCGGGAGGGGGCGAAGGT
T101GCCCGGGGCCGTGCGCTGGGTGCTGCTGCTGCGGTGGCGGCGGCGGTGCC151TGCGGTTGCAGCGGCTGCT
GGGGTTGCGCGTGGAAACCGCGCCCCGCACT201TGCGGCGGGCGAGCCCATCGCGCCGTAGTACAGGTGCAGAGC
GCTGGGGG251GCGCCAGGATCCCCGGCATCGCAGGGCCCGAGGGGTCCGGCCCCACTCGC301ATGGGGCCAGCG
GGCGGCTCTACGGACACTGCATAGTCCGAGACTGGAGC351GTAAGTGTAGGTGCCGGCCGCCGGGCAGTCCCCTG
GCAGCGGGGCTGCAA401AGAAAGCCGGGTCCTGCTCCACGCCATCCAGCGGGGATGTGTCCGGAGTG451GGCAG
AGGGTAGCCGTCGAGCGCGGGAGCGCCCAGTCCCTGGCAGTCCCG501ATAGTGGGGGCCCATGTGCGGAGACATC
AGCGGAGGACCGGCCGGATAGC551CCGGCTCCGGGAAAGGCAGACCCAGGCCATCCATGGCCACGCGGCCGCCC6
01TCGGGACCAAGCGCGCCGGCCTGGGGCTCGACGAGAGCGTGCAGGAAGCC651TCCCTCCACCCGCTTCATGCG
CTTCACCTGCTTGCGCCGCCGCGGCCGGT701ACTTGTAGTTGGGGTGGTCCTGCATATGCTGCACGCGCAGCCGC
TCGGCC751TCTTCCACGAAGGGCCGCTTCTCTGCCAAGGTCAACGCCTTCCAAGACTT801GCCTGCAGGG

Nick Staffa
Telephone: 919-316-4569 (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: Roy W.
Reter (reter at niehs.nih.gov) National Institute of Environmental Health Sciences National Institutes of Health Research Triangle Park, North Carolina From cjfields at uiuc.edu Wed Mar 5 21:22:53 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Mar 2008 20:22:53 -0600 Subject: [Bioperl-l] SeqIO In-Reply-To: References: Message-ID: I thought GCG format changed somewhere along the way but I maybe I'm wrong? Regardless, you'll have to post this as a bug (along with an example file). Also, kind of odd that the sequence data wasn't checked... chris On Mar 5, 2008, at 5:43 PM, Staffa, Nick (NIH/NIEHS) wrote: > So the Howto says that Bio::SeqIO will read almost any known format > including GCG. > So I create a GCG file with Seqlab and try to printout its sequence > as a > string. ( I did guess at the way to get the sequence string: > > #!/usr/bin/perl -w > use strict; > $| = 1; > use Bio::SeqIO; > my $number_of_files = @ARGV; > if(!$number_of_files){print "no files entered\n";exit:} > foreach my $file (@ARGV){ > my $seqio_object = Bio::SeqIO->new(-file => $file); > my $seq_object = $seqio_object->next_seq; > my $sequence = $seq_object->seq; > print "$sequence\n"; > my $status = &windowscore($sequence); > } > > But what it returned was the entire contents of the file with no > format > decoding. Have I been deluded? 
Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Wed Mar 5 21:33:48 2008 From: jason at bioperl.org (Jason Stajich) Date: Wed, 5 Mar 2008 18:33:48 -0800 Subject: [Bioperl-l] SeqIO In-Reply-To: References: Message-ID: <797E42FC-C59F-4431-BAF1-11D3FAE9F9D0@bioperl.org> probably you should try specifying the format explicitly first- as in (-format => 'gcg') -j On Mar 5, 2008, at 6:22 PM, Chris Fields wrote: > I thought GCG format changed somewhere along the way but I maybe > I'm wrong? Regardless, you'll have to post this as a bug (along > with an example file). > > Also, kind of odd that the sequence data wasn't checked... > > chris > > On Mar 5, 2008, at 5:43 PM, Staffa, Nick (NIH/NIEHS) wrote: > >> So the Howto says that Bio::SeqIO will read almost any known format >> including GCG. >> So I create a GCG file with Seqlab and try to printout its >> sequence as a >> string. ( I did guess at the way to get the sequence string: >> >> #!/usr/bin/perl -w >> use strict; >> $| = 1; >> use Bio::SeqIO; >> my $number_of_files = @ARGV; >> if(!$number_of_files){print "no files entered\n";exit:} >> foreach my $file (@ARGV){ >> my $seqio_object = Bio::SeqIO->new(-file => $file); >> my $seq_object = $seqio_object->next_seq; >> my $sequence = $seq_object->seq; >> print "$sequence\n"; >> my $status = &windowscore($sequence); >> } >> >> But what it returned was the entire contents of the file with no >> format >> decoding. Have I been deluded? 
> > Christopher Fields > Postdoctoral Researcher > Lab of Dr.
Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bosborne11 at verizon.net Wed Mar 5 21:01:07 2008 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 05 Mar 2008 21:01:07 -0500 Subject: [Bioperl-l] SeqIO In-Reply-To: References: Message-ID: <19DC527F-3D34-4F3E-9B4C-D2C6011A2C8F@verizon.net> Nick, Take a look at the GCG files that are used in the SeqIO tests: bioperl-live//t/data/test.gcg bioperl-live//t/data/test_badlf.gcg Does the file that you created have a format like the format in those files? I'm guessing you're going to say 'yes', from the looks of your output. Brian O. On Mar 5, 2008, at 6:43 PM, Staffa, Nick (NIH/NIEHS) wrote: > So the Howto says that Bio::SeqIO will read almost any known format > including GCG. > So I create a GCG file with Seqlab and try to printout its sequence > as a > string. ( I did guess at the way to get the sequence string: > > #!/usr/bin/perl -w > use strict; > $| = 1; > use Bio::SeqIO; > my $number_of_files = @ARGV; > if(!$number_of_files){print "no files entered\n";exit:} > foreach my $file (@ARGV){ > my $seqio_object = Bio::SeqIO->new(-file => $file); > my $seq_object = $seqio_object->next_seq; > my $sequence = $seq_object->seq; > print "$sequence\n"; > my $status = &windowscore($sequence); > } > > But what it returned was the entire contents of the file with no > format > decoding. Have I been deluded? 
From staffa at niehs.nih.gov Wed Mar 5 22:09:11 2008 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Wed, 05 Mar 2008 22:09:11 -0500 Subject: [Bioperl-l] SeqIO In-Reply-To: <797E42FC-C59F-4431-BAF1-11D3FAE9F9D0@bioperl.org> Message-ID: Verily, One interpretation of the docs might be: will read any format if the format is specified. I was hoping that I could write a program that one needn't specify format. It'd be more user-friendly and useful.
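
For reference, the explicit-format approach suggested in this thread might look like the sketch below. It is only an illustration, not the poster's program: the helper name first_sequence_string and the --format option are made up for this example. When no format is given, -format is simply omitted, so Bio::SeqIO falls back on its own guess (per the thread, this depends on the file extension, e.g. '.gcg').

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;
use Bio::SeqIO;

# Hypothetical helper (not part of BioPerl): return the first sequence in
# $file as a plain string. If $format is undef, -format is left out and
# Bio::SeqIO is left to determine the format on its own.
sub first_sequence_string {
    my ($file, $format) = @_;
    my %args = (-file => $file);
    $args{-format} = $format if defined $format;
    my $seqio = Bio::SeqIO->new(%args);
    my $seq   = $seqio->next_seq
        or die "no sequence found in $file\n";
    return $seq->seq;
}

# --format is an assumed command-line option for this sketch
my $format;
GetOptions('format=s' => \$format)
    or die "usage: $0 [--format NAME] file...\n";

for my $file (@ARGV) {
    print first_sequence_string($file, $format), "\n";
}
```

Run as, say, "perl first_seq.pl --format gcg new.seq" (script and file names are placeholders). Leaving --format off keeps the "user-friendly" behaviour, at the cost of relying on the guess.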
On 3/5/08 9:33 PM, "Jason Stajich" wrote: > probably you should try specifying the format explicitly first- as in > (-format => 'gcg') > > -j > On Mar 5, 2008, at 6:22 PM, Chris Fields wrote: > >> I thought GCG format changed somewhere along the way but I maybe >> I'm wrong? Regardless, you'll have to post this as a bug (along >> with an example file). >> >> Also, kind of odd that the sequence data wasn't checked... >> >> chris >> >> On Mar 5, 2008, at 5:43 PM, Staffa, Nick (NIH/NIEHS) wrote: >> >>> So the Howto says that Bio::SeqIO will read almost any known format >>> including GCG. >>> So I create a GCG file with Seqlab and try to printout its >>> sequence as a >>> string. ( I did guess at the way to get the sequence string: >>> >>> #!/usr/bin/perl -w >>> use strict; >>> $| = 1; >>> use Bio::SeqIO; >>> my $number_of_files = @ARGV; >>> if(!$number_of_files){print "no files entered\n";exit:} >>> foreach my $file (@ARGV){ >>> my $seqio_object = Bio::SeqIO->new(-file => $file); >>> my $seq_object = $seqio_object->next_seq; >>> my $sequence = $seq_object->seq; >>> print "$sequence\n"; >>> my $status = &windowscore($sequence); >>> } >>> >>> But what it returned was the entire contents of the file with no >>> format >>> decoding. Have I been deluded? 
>> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr.
Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Wed Mar 5 22:44:14 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Mar 2008 21:44:14 -0600 Subject: [Bioperl-l] SeqIO In-Reply-To: <1B139406-897F-496F-8709-FCAAD4EFEDE3@verizon.net> References: <1B139406-897F-496F-8709-FCAAD4EFEDE3@verizon.net> Message-ID: <9146DF9D-C0D6-4F18-9B7E-7BB42FCE0737@uiuc.edu> Heh, good one! Though Jason may have worked out the issue (not indicating the format explicitly). Would be worth looking at the tested files. As for dinosaurs, well I can't talk ... chris On Mar 5, 2008, at 8:49 PM, Brian Osborne wrote: > Chris, > > Many many years ago, when dinosaurs roamed the earth, only about > half of the formats had their own tests. A primitive being saw this > and created simple tests for all the 'missing' formats. His thought > probably was 'this is better than nothing'. In fact this being > assumed that GCG was an outdated and unused format, even as long ago > as that time was. > > The origins of so much of what we now know as 'Bioperl' are > frequently mysterious, or incomprehensible to modern day humans... > > Brian O. > > On Mar 5, 2008, at 9:22 PM, Chris Fields wrote: > >> Also, kind of odd that the sequence data wasn't checked... From bosborne11 at verizon.net Wed Mar 5 21:49:26 2008 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 05 Mar 2008 21:49:26 -0500 Subject: [Bioperl-l] SeqIO In-Reply-To: References: Message-ID: <1B139406-897F-496F-8709-FCAAD4EFEDE3@verizon.net> Chris, Many many years ago, when dinosaurs roamed the earth, only about half of the formats had their own tests. A primitive being saw this and created simple tests for all the 'missing' formats. His thought probably was 'this is better than nothing'. 
In fact this being assumed that GCG was an outdated and unused format, even as long ago as that time was. The origins of so much of what we now know as 'Bioperl' are frequently mysterious, or incomprehensible to modern day humans... Brian O. On Mar 5, 2008, at 9:22 PM, Chris Fields wrote: > Also, kind of odd that the sequence data wasn't checked... From cjfields at uiuc.edu Wed Mar 5 22:54:15 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Mar 2008 21:54:15 -0600 Subject: [Bioperl-l] SeqIO In-Reply-To: References: Message-ID: <67C6AE9D-3934-4717-A97A-4C31DB4F7E33@uiuc.edu> You can leave off the format, but you must append the correct file extension for the parser to determine the correct format ('.gcg' for GCG, for example). There is also Bio::Tools::GuessSeqFormat though it doesn't cover all formats. chris On Mar 5, 2008, at 9:09 PM, Staffa, Nick (NIH/NIEHS) wrote: > Verily, > One interpretation of the docs might be: will read any format if the > format > is specified. > I was hoping that I could write a program that one needn't specify > format. > It'd be more user-friendly and useful. > > > On 3/5/08 9:33 PM, "Jason Stajich" wrote: > >> probably you should try specifying the format explicitly first- as in >> (-format => 'gcg') >> >> -j >> On Mar 5, 2008, at 6:22 PM, Chris Fields wrote: >> >>> I thought GCG format changed somewhere along the way but I maybe >>> I'm wrong? Regardless, you'll have to post this as a bug (along >>> with an example file). >>> >>> Also, kind of odd that the sequence data wasn't checked... >>> >>> chris >>> >>> On Mar 5, 2008, at 5:43 PM, Staffa, Nick (NIH/NIEHS) wrote: >>> >>>> So the Howto says that Bio::SeqIO will read almost any known format >>>> including GCG. >>>> So I create a GCG file with Seqlab and try to printout its >>>> sequence as a >>>> string. 
( I did guess at the way to get the sequence string: >>>> >>>> #!/usr/bin/perl -w >>>> use strict; >>>> $| = 1; >>>> use Bio::SeqIO; >>>> my $number_of_files = @ARGV; >>>> if(!$number_of_files){print "no files entered\n";exit:} >>>> foreach my $file (@ARGV){ >>>> my $seqio_object = Bio::SeqIO->new(-file => $file); >>>> my $seq_object = $seqio_object->next_seq; >>>> my $sequence = $seq_object->seq; >>>> print "$sequence\n"; >>>> my $status = &windowscore($sequence); >>>> } >>>> >>>> But what it returned was the entire contents of the file with no >>>> format >>>> decoding. Have I been deluded? >>>> >>>> NewDNALength:810March5,200818:26Type:NCheck: >>>> 3368..1TGTTCGAATTCCGTGCGGTCCACCT >>>> CCCCTAGGAGCTCAGTGGGCTGGTT51GGATTCCGTGCCATCCCGGCAGGGCAGAGCCTCGGGAGGGGG >>>> CGAAGGT >>>> T101GCCCGGGGCCGTGCGCTGGGTGCTGCTGCTGCGGTGGCGGCGGCGGTGCC151TGCGGTTGCAGC >>>> GGCTGCT >>>> GGGGTTGCGCGTGGAAACCGCGCCCCGCACT201TGCGGCGGGCGAGCCCATCGCGCCGTAGTACAGGT >>>> GCAGAGC >>>> GCTGGGGG251GCGCCAGGATCCCCGGCATCGCAGGGCCCGAGGGGTCCGGCCCCACTCGC301ATGGG >>>> GCCAGCG >>>> GGCGGCTCTACGGACACTGCATAGTCCGAGACTGGAGC351GTAAGTGTAGGTGCCGGCCGCCGGGCAG >>>> TCCCCTG >>>> GCAGCGGGGCTGCAA401AGAAAGCCGGGTCCTGCTCCACGCCATCCAGCGGGGATGTGTCCGGAGTG4 >>>> 51GGCAG >>>> AGGGTAGCCGTCGAGCGCGGGAGCGCCCAGTCCCTGGCAGTCCCG501ATAGTGGGGGCCCATGTGCGG >>>> AGACATC >>>> AGCGGAGGACCGGCCGGATAGC551CCGGCTCCGGGAAAGGCAGACCCAGGCCATCCATGGCCACGCGG >>>> CCGCCC6 >>>> 01TCGGGACCAAGCGCGCCGGCCTGGGGCTCGACGAGAGCGTGCAGGAAGCC651TCCCTCCACCCGCT >>>> TCATGCG >>>> CTTCACCTGCTTGCGCCGCCGCGGCCGGT701ACTTGTAGTTGGGGTGGTCCTGCATATGCTGCACGCG >>>> CAGCCGC >>>> TCGGCC751TCTTCCACGAAGGGCCGCTTCTCTGCCAAGGTCAACGCCTTCCAAGACTT801GCCTGCA >>>> GGG >>>> >>>> >>>> >>>> Nick Staffa >>>> Telephone: 919-316-4569 (NIEHS: 6-4569) >>>> Scientific Computing Support Group >>>> NIEHS Information Technology Support Services Contract >>>> (Science Task Monitor: Roy W. 
Reter (reter at niehs.nih.gov) >>>> National Institute of Environmental Health Sciences >>>> National Institutes of Health >>>> Research Triangle Park, North Carolina >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> Christopher Fields >>> Postdoctoral Researcher >>> Lab of Dr. Robert Switzer >>> Dept of Biochemistry >>> University of Illinois Urbana-Champaign >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From ewijaya at gmail.com Thu Mar 6 03:16:25 2008 From: ewijaya at gmail.com (Edward Wijaya) Date: Thu, 6 Mar 2008 16:16:25 +0800 Subject: [Bioperl-l] BioPerl Module to Parse Transfac Flat File Database Message-ID: <3521d3670803060016r35ac720ar9f2190631ddaf629@mail.gmail.com> Dear experts, Is there any? The TRANSFAC text file which contain entry like this. Especially we wich to capture the PWM for each of the Transcription factor. Regards, Edward __BEGIN__ VV TRANSFAC MATRIX TABLE, Release 11.1 - licensed - 2007-03-31, (C) Biobase GmbH XX // AC M00001 XX ID V$MYOD_01 XX DT 19.10.1992 (created); ewi. DT 22.10.1997 (updated); dbo. CO Copyright (C), Biobase GmbH. XX NA MyoD XX DE myoblast determination gene product XX BF T00526; MyoD; Species: mouse, Mus musculus. BF T09177; MyoD; Species: mouse, Mus musculus. XX P0 A C G T 01 1 2 2 0 S 02 2 1 2 0 R 03 3 0 1 1 A 04 0 5 0 0 C 05 5 0 0 0 A 06 0 0 4 1 G 07 0 1 4 0 G 08 0 0 0 5 T 09 0 0 5 0 G 10 0 1 2 2 K 11 0 2 0 3 Y 12 1 0 3 1 G ....etc.... 
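
Independent of any BioPerl module, the count matrix in an entry like the one above can be pulled out with a few lines of plain Perl. This is a minimal sketch, assuming the layout visible in the excerpt: two-letter line codes, a P0 row naming the columns, numbered rows holding one count per column followed by a consensus letter, and // between entries. The sub name parse_transfac_matrices and the returned hash layout are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: read TRANSFAC-style matrix entries from a filehandle.
# Returns a list of hashrefs with keys ac, id, columns, rows and consensus.
sub parse_transfac_matrices {
    my ($fh) = @_;
    my (@entries, %cur);
    # Store the current entry if it has matrix rows, then start a new one
    my $flush = sub {
        push @entries, { %cur } if $cur{rows} && @{ $cur{rows} };
        %cur = ();
    };
    while (my $line = <$fh>) {
        chomp $line;
        if    ($line =~ m{^//})        { $flush->() }
        elsif ($line =~ /^AC\s+(\S+)/) { $cur{ac} = $1 }
        elsif ($line =~ /^ID\s+(\S+)/) { $cur{id} = $1 }
        elsif ($line =~ /^P0\s+(.+)/)  { $cur{columns} = [ split ' ', $1 ] }
        elsif ($line =~ /^\d+\s+(.+)/) {
            my @fields = split ' ', $1;
            my $n = $cur{columns} ? scalar @{ $cur{columns} } : 4;
            push @{ $cur{rows} }, [ @fields[0 .. $n - 1] ];
            # Anything after the counts is taken as the consensus letter
            $cur{consensus} .= $fields[$n] if defined $fields[$n];
        }
    }
    $flush->();    # the last entry may not be followed by //
    return @entries;
}

# First rows of the entry quoted above, used as sample input
my $sample = <<'END';
AC  M00001
XX
ID  V$MYOD_01
XX
P0      A      C      G      T
01      1      2      2      0      S
02      2      1      2      0      R
03      3      0      1      1      A
04      0      5      0      0      C
XX
//
END

open my $fh, '<', \$sample or die "cannot read sample: $!";
for my $m (parse_transfac_matrices($fh)) {
    print "$m->{ac} $m->{id} consensus $m->{consensus}\n";
    print join("\t", @{ $m->{columns} }), "\n";
    print join("\t", @$_), "\n" for @{ $m->{rows} };
}
close $fh;
```

The rows are raw counts as stored in the file; turning them into frequencies or log-odds scores is a separate step.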
From watashi at post.com Thu Mar 6 07:06:42 2008 From: watashi at post.com (Masa Masa) Date: Thu, 6 Mar 2008 07:06:42 -0500 Subject: [Bioperl-l] failure of add_seqfeature Message-ID: <20080306120642.4800F16427A@ws1-4.us4.outblaze.com> Dear experts, Would anybody know why the following codes generate an error of: ------------- EXCEPTION ------------- MSG: Bio::SeqFeature::Generic=HASH(0x94583c0) is not contained within parent feature, and expansion is not valid STACK Bio::SeqFeature::Generic::add_SeqFeature /usr/lib/perl5/site_perl/5.8.0/Bio/SeqFeature/Generic.pm:767 STACK toplevel test.pl:118 -------------------------------------- 15616 15693 79568 83016 ================= use Bio::Graphics; use Bio::SeqFeature::Generic; use Bio::SeqIO; my $bsg = 'Bio::SeqFeature::Generic'; my $unseqfea = $bsg->new( -start=>$from[$i], -end=>$to[$i], -display_name=>'U'); for (my $i=0; $i < @from; $i++) { print "$from[$i] $to[$i]\n"; $unseqfea->add_SeqFeature($bsg->new(-start=>$from[$i],-end=>$to[$i])); if ($i > 10) { exit; } } -- Want an e-mail address like mine? Get a free e-mail account today at www.mail.com! From heikki at sanbi.ac.za Thu Mar 6 07:20:03 2008 From: heikki at sanbi.ac.za (Heikki Lehvaslaiho) Date: Thu, 6 Mar 2008 14:20:03 +0200 Subject: [Bioperl-l] SeqIO In-Reply-To: References: Message-ID: <200803061420.04123.heikki@sanbi.ac.za> Nick, This is the regex that Bio::Tools::GuessSeqFormat uses to identify a gcg file: /Length: .*Type: .*Check: .*\.\.$/ It is the second line in GCG file. If first line matches to some other format regex, this will not not be evaluated. Let us know, -Heikki On Thursday 06 March 2008 05:09:11 Staffa, Nick (NIH/NIEHS) wrote: > Verily, > One interpretation of the docs might be: will read any format if the format > is specified. > I was hoping that I could write a program that one needn't specify format. > It'd be more user-friendly and useful. 
> > On 3/5/08 9:33 PM, "Jason Stajich" wrote: > > probably you should try specifying the format explicitly first- as in > > (-format => 'gcg') > > > > -j > > > > On Mar 5, 2008, at 6:22 PM, Chris Fields wrote: > >> I thought GCG format changed somewhere along the way but I maybe > >> I'm wrong? Regardless, you'll have to post this as a bug (along > >> with an example file). > >> > >> Also, kind of odd that the sequence data wasn't checked... > >> > >> chris > >> > >> On Mar 5, 2008, at 5:43 PM, Staffa, Nick (NIH/NIEHS) wrote: > >>> So the Howto says that Bio::SeqIO will read almost any known format > >>> including GCG. > >>> So I create a GCG file with Seqlab and try to printout its > >>> sequence as a > >>> string. ( I did guess at the way to get the sequence string: > >>> > >>> #!/usr/bin/perl -w > >>> use strict; > >>> $| = 1; > >>> use Bio::SeqIO; > >>> my $number_of_files = @ARGV; > >>> if(!$number_of_files){print "no files entered\n";exit:} > >>> foreach my $file (@ARGV){ > >>> my $seqio_object = Bio::SeqIO->new(-file => $file); > >>> my $seq_object = $seqio_object->next_seq; > >>> my $sequence = $seq_object->seq; > >>> print "$sequence\n"; > >>> my $status = &windowscore($sequence); > >>> } > >>> > >>> But what it returned was the entire contents of the file with no > >>> format > >>> decoding. Have I been deluded? 
> >> > >> Christopher Fields > >> Postdoctoral Researcher > >> Lab of Dr.
Robert Switzer > >> Dept of Biochemistry > >> University of Illinois Urbana-Champaign > >> > >> > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ From bix at sendu.me.uk Thu Mar 6 08:07:21 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 06 Mar 2008 13:07:21 +0000 Subject: [Bioperl-l] BioPerl Module to Parse Transfac Flat File Database In-Reply-To: <3521d3670803060016r35ac720ar9f2190631ddaf629@mail.gmail.com> References: <3521d3670803060016r35ac720ar9f2190631ddaf629@mail.gmail.com> Message-ID: <47CFEC89.1000705@sendu.me.uk> Edward Wijaya wrote: > Dear experts, > > Is there any? The TRANSFAC text file which contain entry like this. > Especially we wich to capture the PWM for each of the Transcription > factor. Yes; I've written a module to do this, I just haven't committed it yet because certain things aren't quite right in terms of the API. But to just grab the PWM it should work fine. If you want I can email you the modules. 
From sdavis2 at mail.nih.gov Thu Mar 6 08:40:25 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 6 Mar 2008 08:40:25 -0500
Subject: [Bioperl-l] BioPerl Module to Parse Transfac Flat File Database
In-Reply-To: <47CFEC89.1000705@sendu.me.uk>
References: <3521d3670803060016r35ac720ar9f2190631ddaf629@mail.gmail.com> <47CFEC89.1000705@sendu.me.uk>
Message-ID: <264855a00803060540u1f3d0f92pbab13349595a0eb3@mail.gmail.com>

On Thu, Mar 6, 2008 at 8:07 AM, Sendu Bala wrote:
> Edward Wijaya wrote:
> > Dear experts,
> >
> > Is there any? The TRANSFAC text file which contain entry like this.
> > Especially we wich to capture the PWM for each of the Transcription
> > factor.
>
> Yes; I've written a module to do this, I just haven't committed it yet
> because certain things aren't quite right in terms of the API. But to
> just grab the PWM it should work fine. If you want I can email you the
> modules.

I believe there is a set of non-BioPerl modules called TFBS. See here (although I'm not sure this is the most up-to-date site):

http://tfbs.genereg.net/

Sean

From David.Messina at sbc.su.se Thu Mar 6 09:55:24 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 6 Mar 2008 15:55:24 +0100
Subject: [Bioperl-l] failure of add_seqfeature
In-Reply-To: <20080306120642.4800F16427A@ws1-4.us4.outblaze.com>
References: <20080306120642.4800F16427A@ws1-4.us4.outblaze.com>
Message-ID: <628aabb70803060655k5245296etf5ee2f31755230d3@mail.gmail.com>

Hi Masa,

Could you give us a little more information? A complete test case (the code you included doesn't run because, for example, the @from array doesn't exist) and an input file would be helpful, as well as the version of BioPerl you are using.
Dave

From staffa at niehs.nih.gov Thu Mar 6 10:23:34 2008
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Thu, 06 Mar 2008 10:23:34 -0500
Subject: [Bioperl-l] SeqIO
In-Reply-To: <200803061420.04123.heikki@sanbi.ac.za>
Message-ID:

Here's the scoop:

When I use Jason's suggestion, (-format => 'gcg'), my program works without complaint on the original file, which looks like:

!!NA_SEQUENCE 1.0
NewDNA Length: 810 March 5, 2008 18:26 Type: N Check: 3368 ..
1 TGTTCGAATT CCGTGCGGTC CACCTCCCCT AGGAGCTCAG TGGGCTGGTT
etc.

BUT if I remove the first line to test Bio::Tools::GuessSeqFormat (which should make it the retro GCG format, from before version 11?), my program runs, but there IS a complaint:

Use of uninitialized value in scalar chomp at /usr/lib/perl5/site_perl/5.8.5/Bio/SeqIO/gcg.pm line 118, line 1.

BUT if I remove (-format => 'gcg'), I get no complaint, but the sequence returned still has its position numbers embedded. This affects my calculations.

Thanks, at least I know what my options are.

Nick Staffa
Telephone: 919-316-4569 (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: Roy W. Reter (reter at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina

On 3/6/08 7:20 AM, "Heikki Lehvaslaiho" wrote:
>
> Nick,
>
> This is the regex that Bio::Tools::GuessSeqFormat uses to identify a gcg file:
>
> /Length: .*Type: .*Check: .*\.\.$/
>
> It is the second line in GCG file. If first line matches to some other format
> regex, this will not not be evaluated.
>
> Let us know,
>
> -Heikki
>
> On Thursday 06 March 2008 05:09:11 Staffa, Nick (NIH/NIEHS) wrote:
>> Verily,
>> One interpretation of the docs might be: will read any format if the format
>> is specified.
>> I was hoping that I could write a program that one needn't specify format.
>> It'd be more user-friendly and useful.
>> >> On 3/5/08 9:33 PM, "Jason Stajich" wrote: >>> probably you should try specifying the format explicitly first- as in >>> (-format => 'gcg') >>> >>> -j >>> >>> On Mar 5, 2008, at 6:22 PM, Chris Fields wrote: >>>> I thought GCG format changed somewhere along the way but I maybe >>>> I'm wrong? Regardless, you'll have to post this as a bug (along >>>> with an example file). >>>> >>>> Also, kind of odd that the sequence data wasn't checked... >>>> >>>> chris >>>> >>>> On Mar 5, 2008, at 5:43 PM, Staffa, Nick (NIH/NIEHS) wrote: >>>>> So the Howto says that Bio::SeqIO will read almost any known format >>>>> including GCG. >>>>> So I create a GCG file with Seqlab and try to printout its >>>>> sequence as a >>>>> string. ( I did guess at the way to get the sequence string: >>>>> >>>>> #!/usr/bin/perl -w >>>>> use strict; >>>>> $| = 1; >>>>> use Bio::SeqIO; >>>>> my $number_of_files = @ARGV; >>>>> if(!$number_of_files){print "no files entered\n";exit:} >>>>> foreach my $file (@ARGV){ >>>>> my $seqio_object = Bio::SeqIO->new(-file => $file); >>>>> my $seq_object = $seqio_object->next_seq; >>>>> my $sequence = $seq_object->seq; >>>>> print "$sequence\n"; >>>>> my $status = &windowscore($sequence); >>>>> } >>>>> >>>>> But what it returned was the entire contents of the file with no >>>>> format >>>>> decoding. Have I been deluded? 
>>>> >>>> Christopher Fields >>>> Postdoctoral Researcher >>>> Lab of Dr.
Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From hlapp at gmx.net Thu Mar 6 10:26:52 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 6 Mar 2008 10:26:52 -0500
Subject: [Bioperl-l] failure of add_seqfeature
In-Reply-To: <20080306120642.4800F16427A@ws1-4.us4.outblaze.com>
References: <20080306120642.4800F16427A@ws1-4.us4.outblaze.com>
Message-ID: <6BD917FC-803E-471B-A0C4-219286E53C47@gmx.net>

It seems you are adding subfeatures with a location that is not within
their parent feature location. If that's indeed what you want to do,
add the 'EXPAND' argument.

Excerpted from the POD of Bio::SeqFeature::Generic:

  Usage   : $feat->add_SeqFeature($subfeat);
            $feat->add_SeqFeature($subfeat,'EXPAND')
  Function: adds a SeqFeature into the subSeqFeature array.
            with no 'EXPAND' qualifer, subfeat will be tested
            as to whether it lies inside the parent, and throw
            an exception if not.

            If EXPAND is used, the parent's start/end/strand will
            be adjusted so that it grows to accommodate the new
            subFeature

On Mar 6, 2008, at 7:06 AM, Masa Masa wrote:

> Dear experts,
>
> Would anybody know why the following codes generate an error of:
>
>
> ------------- EXCEPTION -------------
> MSG: Bio::SeqFeature::Generic=HASH(0x94583c0) is not contained within parent feature, and expansion is not valid
> STACK Bio::SeqFeature::Generic::add_SeqFeature /usr/lib/perl5/site_perl/5.8.0/Bio/SeqFeature/Generic.pm:767
> STACK toplevel test.pl:118
>
> --------------------------------------
> 15616 15693
> 79568 83016
>
> =================
>
>
> use Bio::Graphics;
> use Bio::SeqFeature::Generic;
> use Bio::SeqIO;
>
>
> my $bsg = 'Bio::SeqFeature::Generic';
>
> my $unseqfea = $bsg->new( -start=>$from[$i], -end=>$to[$i], -display_name=>'U');
>
> for (my $i=0; $i < @from; $i++) {
>   print "$from[$i] $to[$i]\n";
>   $unseqfea->add_SeqFeature($bsg->new(-start=>$from[$i],-end=>$to[$i]));
>   if ($i > 10) {
>     exit;
>   }
> }
>
> --
> Want an e-mail address like mine?
> Get a free e-mail account today at www.mail.com!
> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Thu Mar 6 10:41:49 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 06 Mar 2008 15:41:49 +0000 Subject: [Bioperl-l] BioPerl Module to Parse Transfac Flat File Database In-Reply-To: <264855a00803060540u1f3d0f92pbab13349595a0eb3@mail.gmail.com> References: <3521d3670803060016r35ac720ar9f2190631ddaf629@mail.gmail.com> <47CFEC89.1000705@sendu.me.uk> <264855a00803060540u1f3d0f92pbab13349595a0eb3@mail.gmail.com> Message-ID: <47D010BD.4000801@sendu.me.uk> Sean Davis wrote: > On Thu, Mar 6, 2008 at 8:07 AM, Sendu Bala wrote: >> Edward Wijaya wrote: >> > Dear experts, >> > >> > Is there any? The TRANSFAC text file which contain entry like this. >> > Especially we wich to capture the PWM for each of the Transcription >> > factor. >> >> Yes; I've written a module to do this, I just haven't committed it yet >> because certain things aren't quite right in terms of the API. But to >> just grab the PWM it should work fine. If you want I can email you the >> modules. > > I believe there are a set of non-bioperl modules called TFBS. See > here (although I'm not sure this is the most up-to-date site): > > http://tfbs.genereg.net/ I believe it's out of date enough to not work on the latest Transfac data, though I haven't used tried to confirm. At any rate, the Transfac (Pro) database is pretty strange and complicated, and the TFBS modules certainly don't let you access everything in the way you might want or expect. From cain.cshl at gmail.com Thu Mar 6 11:43:35 2008 From: cain.cshl at gmail.com (Scott Cain) Date: Thu, 06 Mar 2008 11:43:35 -0500 Subject: [Bioperl-l] anonymous cvs? 
Message-ID: <1204821815.6689.7.camel@frissell>

Hi All,

So now that the transition to svn is complete (and I like it), should
anonymous cvs still be working? I believe there was discussion about
keeping it going via mirroring, and I hope that is the case. It will
make life a little easier for people who want to do automated installs
of GBrowse and would like to use the installer script to get bioperl via
anon cvs. If anon cvs is no longer available, does anyone have
suggestions for the best route to take for getting command line svn on
Windows?

Thanks,
Scott

--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory

From cain.cshl at gmail.com Thu Mar 6 11:48:08 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 06 Mar 2008 11:48:08 -0500
Subject: [Bioperl-l] anonymous cvs?
In-Reply-To: <1204821815.6689.7.camel@frissell>
References: <1204821815.6689.7.camel@frissell>
Message-ID: <1204822088.6689.8.camel@frissell>

I should have mentioned that I tried it and it is not currently working:

  $ cvs -d :pserver:cvs at code.open-bio.org:/home/repository/bioperl checkout bioperl-live
  can't create temporary directory /tmp/cvs-serv32067
  No space left on device

On Thu, 2008-03-06 at 11:43 -0500, Scott Cain wrote:
> Hi All,
>
> So now that the transition to svn is complete (and I like it), should
> anonymous cvs still be working? I believe there was discussion about
> keeping it going via mirroring, and I hope that is the case. It will
> make life a little easier for people who want to do automated installs
> of GBrowse and would like to use the installer script to get bioperl via
> anon cvs. If anon cvs is no longer available, does anyone have
> suggestions for the best route to take for getting command line svn on
> Windows?
>
> Thanks,
> Scott
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory

From Marc.Logghe at ablynx.com Thu Mar 6 11:22:10 2008
From: Marc.Logghe at ablynx.com (Marc Logghe)
Date: Thu, 6 Mar 2008 17:22:10 +0100
Subject: [Bioperl-l] SeqIO
In-Reply-To:
Message-ID: <03C512635899144083CADB0EE22201890172A836@alpaca.lan.ablynx.com>

Hi Nick,

I don't think you should leave out the -format option. You have to leave
it in, but the format should be provided by the B::T::GuessSeqFormat
object. Something like:

  #!/usr/bin/perl
  use strict;
  use Bio::SeqIO;
  use Bio::Tools::GuessSeqFormat;
  $| = 1;
  my $number_of_files = @ARGV;
  if(!$number_of_files){print "no files entered\n";exit;}
  foreach my $file (@ARGV){
      # let GuessSeqFormat work out the format, then hand it to SeqIO
      my $guesser = Bio::Tools::GuessSeqFormat->new(-file => $file);
      my $seqio_object = Bio::SeqIO->new(-file   => $guesser->file,
                                         -format => $guesser->guess);
      # only the first sequence in each file is read
      my $seq_object = $seqio_object->next_seq;
      my $sequence = $seq_object->seq;
      print "$sequence\n";
  }

HTH,
Marc

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Staffa, Nick (NIH/NIEHS)
> Sent: Thursday, 6 March 2008 16:24
> To: Heikki Lehvaslaiho; bioperl-l at lists.open-bio.org
> Cc: Chris Fields
> Subject: Re: [Bioperl-l] SeqIO
>
> Here's the scoop:
> When I use Jason's suggestion, (-format => 'gcg'),
> My program works without complaint on the original file that looks like:
> !!NA_SEQUENCE 1.0
> NewDNA Length: 810 March 5, 2008 18:26 Type: N Check: 3368 ..
>
> 1 TGTTCGAATT CCGTGCGGTC CACCTCCCCT AGGAGCTCAG TGGGCTGGTT
> et c.
> > BUT if I remove the first line to test Bio::Tools::GuessSeqFormat, > (which should be retro-gcg format (before version 11?)), > my program runs, but there IS a complaint: > Use of uninitialized value in scalar chomp at > /usr/lib/perl5/site_perl/5.8.5/Bio/SeqIO/gcg.pm line 118, line 1. > BUT > If I remove (-format => 'gcg'), I get no complaint, but the sequence > returned still has its numbers imbedded. This effects my calculations. > > Thanks, at least i know what my options are. > > > > Nick Staffa > Telephone: 919-316-4569 (NIEHS: 6-4569) > Scientific Computing Support Group > NIEHS Information Technology Support Services Contract > (Science Task Monitor: Roy W. Reter (reter at niehs.nih.gov) > National Institute of Environmental Health Sciences > National Institutes of Health > Research Triangle Park, North Carolina > > > > > > > > > > > On 3/6/08 7:20 AM, "Heikki Lehvaslaiho" wrote: > > > > > Nick, > > > > This is the regex that Bio::Tools::GuessSeqFormat uses to identify a gcg > file: > > > > /Length: .*Type: .*Check: .*\.\.$/ > > > > It is the second line in GCG file. If first line matches to some other > format > > regex, this will not not be evaluated. > > > > Let us know, > > > > -Heikki > > > > On Thursday 06 March 2008 05:09:11 Staffa, Nick (NIH/NIEHS) wrote: > >> Verily, > >> One interpretation of the docs might be: will read any format if the > format > >> is specified. > >> I was hoping that I could write a program that one needn't specify > format. > >> It'd be more user-friendly and useful. > >> > >> On 3/5/08 9:33 PM, "Jason Stajich" wrote: > >>> probably you should try specifying the format explicitly first- as in > >>> (-format => 'gcg') > >>> > >>> -j > >>> > >>> On Mar 5, 2008, at 6:22 PM, Chris Fields wrote: > >>>> I thought GCG format changed somewhere along the way but I maybe > >>>> I'm wrong? Regardless, you'll have to post this as a bug (along > >>>> with an example file). 
Robert Switzer > >>>> Dept of Biochemistry > >>>> University of Illinois Urbana-Champaign > >>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From stefan.kirov at bms.com Thu Mar 6 10:51:25 2008 From: stefan.kirov at bms.com (Stefan Kirov) Date: Thu, 06 Mar 2008 10:51:25 -0500 Subject: [Bioperl-l] BioPerl Module to Parse Transfac Flat File Database In-Reply-To: <47D010BD.4000801@sendu.me.uk> References: <3521d3670803060016r35ac720ar9f2190631ddaf629@mail.gmail.com> <47CFEC89.1000705@sendu.me.uk> <264855a00803060540u1f3d0f92pbab13349595a0eb3@mail.gmail.com> <47D010BD.4000801@sendu.me.uk> Message-ID: <47D012FD.7090600@bms.com> Sendu Bala wrote: > Sean Davis wrote: >> On Thu, Mar 6, 2008 at 8:07 AM, Sendu Bala wrote: >>> Edward Wijaya wrote: >>> > Dear experts, >>> > >>> > Is there any? The TRANSFAC text file which contain entry like this. >>> > Especially we wich to capture the PWM for each of the Transcription >>> > factor. >>> >>> Yes; I've written a module to do this, I just haven't committed it yet >>> because certain things aren't quite right in terms of the API. But to >>> just grab the PWM it should work fine. If you want I can email you the >>> modules. >> >> I believe there are a set of non-bioperl modules called TFBS. See >> here (although I'm not sure this is the most up-to-date site): >> >> http://tfbs.genereg.net/ > > I believe it's out of date enough to not work on the latest Transfac > data, though I haven't used tried to confirm. 
> > At any rate, the Transfac (Pro) database is pretty strange and > complicated, and the TFBS modules certainly don't let you access > everything in the way you might want or expect. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > Also be careful: there is a difference between PFM and PWM. Getting PWM through most programs I have encountered will assume random distribution (0.25 per each position in the background), unless you specify your own. This could be something you may be comfortable with, but you definitely should be aware of. From jay at jays.net Thu Mar 6 12:03:51 2008 From: jay at jays.net (Jay Hannah) Date: Thu, 06 Mar 2008 11:03:51 -0600 Subject: [Bioperl-l] anonymous cvs? In-Reply-To: <1204821815.6689.7.camel@frissell> References: <1204821815.6689.7.camel@frissell> Message-ID: <47D023F7.4000803@jays.net> Scott Cain wrote: > It will make life a little easier for people who want to do automated installs > of GBrowse and would like to use the installer script to get bioperl via > anon cvs. Those installer scripts can't use anon SVN instead? > If anon cvs is no longer available, does anyone have > suggestions for the best route to take for getting command line svn on > Windows? > At $work our Windows guys use GUIs for both CVS (repo dead this summer) and SVN. Are there command-line (MS-DOS?) CVS clients for Windows? And there isn't an SVN equivalent? j http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From whs at ebi.ac.uk Thu Mar 6 12:08:51 2008 From: whs at ebi.ac.uk (William Spooner) Date: Thu, 6 Mar 2008 17:08:51 +0000 Subject: [Bioperl-l] anonymous cvs? In-Reply-To: <1204821815.6689.7.camel@frissell> References: <1204821815.6689.7.camel@frissell> Message-ID: <07E3119E-0354-4E93-9980-3CB2B26DF2BE@ebi.ac.uk> This will be important for Ensembl as well. As far as I know all of their install docs refer to BioPerl's anonymous CVS. 
On 6 Mar 2008, at 16:43, Scott Cain wrote:

> Hi All,
>
> So now that the transition to svn is complete (and I like it), should
> anonymous cvs still be working? I believe there was discussion about
> keeping it going via mirroring, and I hope that is the case. It will
> make life a little easier for people who want to do automated installs
> of GBrowse and would like to use the installer script to get bioperl via
> anon cvs. If anon cvs is no longer available, does anyone have
> suggestions for the best route to take for getting command line svn on
> Windows?
>
> Thanks,
> Scott
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D. cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/) 216-392-3087
> Cold Spring Harbor Laboratory
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

---
William Spooner
Visiting Scientist
whs at ebi.ac.uk

From MEC at stowers-institute.org Thu Mar 6 11:58:57 2008
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 6 Mar 2008 10:58:57 -0600
Subject: [Bioperl-l] BioPerl Module to Parse Transfac Flat File Database
In-Reply-To: <47D010BD.4000801@sendu.me.uk>
References: <3521d3670803060016r35ac720ar9f2190631ddaf629@mail.gmail.com> <47CFEC89.1000705@sendu.me.uk> <264855a00803060540u1f3d0f92pbab13349595a0eb3@mail.gmail.com> <47D010BD.4000801@sendu.me.uk>
Message-ID:

We use TFBS all the time against data coming from a recent local install
of TRANSFAC(r) Professional 11.1 (2007-03-31); the most recent is 11.4
(2007-12-14).

TFBS::* has the nice advantage that you can interoperate Transfac PWMs
with other (say, Jaspar) matrices and/or simple consensus sequence
patterns; and it COULD be fairly easily extended to allow interoperation
with other sources, say cisRED. "One interface to rule them all" - bwa
ha ha.

However, if you DO have locally installed Transfac (Pro) ($$), and want
to use just it, then you should know that you can also call their
`match` routines from the unix command line (though this is not
documented to my knowledge). I can supply my cheat sheet or otherwise
advise if desired. Also, if you go this way, I've written the requisite
TFMatchOut2GFF to convert TRANSFAC match's output to GFF, if it suits
your purpose, which I could release if asked.

If you want to use TFBS::**, I have written a command-line wrapper for
the TFBS perl modules that might give you a leg up. I could release it
too, if useful.

But I agree, if I recall, TFBS::* were dropped from ongoing active
development due to issues with data access policies. And I think that
they no longer work with remotely hosted Transfac. They did a few years
ago. I think I tested a while ago and found that they do not.

Malcolm Cook
Stowers Institute for Medical Research - Kansas City, Missouri

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Thursday, March 06, 2008 9:42 AM
> To: Sean Davis
> Cc: bioperl-l at lists.open-bio.org; Edward Wijaya
> Subject: Re: [Bioperl-l] BioPerl Module to Parse Transfac Flat File Database
>
> Sean Davis wrote:
> > On Thu, Mar 6, 2008 at 8:07 AM, Sendu Bala wrote:
> >> Edward Wijaya wrote:
> >> > Dear experts,
> >> >
> >> > Is there any? The TRANSFAC text file which contain entry like this.
> >> > Especially we wich to capture the PWM for each of the Transcription
> >> > factor.
> >>
> >> Yes; I've written a module to do this, I just haven't committed it
> >> yet because certain things aren't quite right in terms of the API.
> >> But to just grab the PWM it should work fine. If you want I can
> >> email you the modules.
> >
> > I believe there are a set of non-bioperl modules called TFBS.
See > > here (although I'm not sure this is the most up-to-date site): > > > > http://tfbs.genereg.net/ > > I believe it's out of date enough to not work on the latest > Transfac data, though I haven't used tried to confirm. > > At any rate, the Transfac (Pro) database is pretty strange > and complicated, and the TFBS modules certainly don't let you > access everything in the way you might want or expect. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Thu Mar 6 12:10:35 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 6 Mar 2008 11:10:35 -0600 Subject: [Bioperl-l] anonymous cvs? In-Reply-To: <1204821815.6689.7.camel@frissell> References: <1204821815.6689.7.camel@frissell> Message-ID: <84E60454-2B09-4F77-9BD6-4B9150304B2D@uiuc.edu> BioPerl CVS is no longer being updated; you have to use Subversion to grab the latest (we have anon. svn set up for this). We discussed syncing svn commits over to cvs but found it way too problematic and decided to make a clean break. The best option I can think of as a replacement (so everyone isn't dependent on installing svn to get Gbrowse and bioperl-live) is to get a cron job set up which drops a bioperl-live archive into bioperl.org/ DIST or bioperl.org/SRC. We have already talked about doing this for nightly builds from svn main trunk; we can probably set that up on our end. Would that be feasible as a fallback in case svn isn't present? The subversion project page has information on Windows versions: http://subversion.tigris.org/project_packages.html chris On Mar 6, 2008, at 10:43 AM, Scott Cain wrote: > Hi All, > > So now that the transition to svn is complete (and I like it), should > anonymous cvs still be working? I believe there was discussion about > keeping it going via mirroring, and I hope that is the case. 
It will
> make life a little easier for people who want to do automated installs
> of GBrowse and would like to use the installer script to get bioperl via
> anon cvs. If anon cvs is no longer available, does anyone have
> suggestions for the best route to take for getting command line svn on
> Windows?
>
> Thanks,
> Scott
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D. cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/) 216-392-3087
> Cold Spring Harbor Laboratory
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From cain.cshl at gmail.com Thu Mar 6 12:22:29 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 06 Mar 2008 12:22:29 -0500
Subject: [Bioperl-l] anonymous cvs?
In-Reply-To: <84E60454-2B09-4F77-9BD6-4B9150304B2D@uiuc.edu>
References: <1204821815.6689.7.camel@frissell> <84E60454-2B09-4F77-9BD6-4B9150304B2D@uiuc.edu>
Message-ID: <1204824149.6689.14.camel@frissell>

Hi Chris,

I think a nightly generated tarball would be sufficient for my use. We
used anon cvs to get the latest bioperl and then threw it away once it
was installed, so a tarball is just as good, if not better, since users
wouldn't need to install svn. Not needing to install svn is a good thing
for all my users, since I think many distributions do not supply it by
default.

Thanks,
Scott

On Thu, 2008-03-06 at 11:10 -0600, Chris Fields wrote:
> BioPerl CVS is no longer being updated; you have to use Subversion to
> grab the latest (we have anon. svn set up for this). We discussed
> syncing svn commits over to cvs but found it way too problematic and
> decided to make a clean break.
> > The best option I can think of as a replacement (so everyone isn't > dependent on installing svn to get Gbrowse and bioperl-live) is to get > a cron job set up which drops a bioperl-live archive into bioperl.org/ > DIST or bioperl.org/SRC. We have already talked about doing this for > nightly builds from svn main trunk; we can probably set that up on our > end. Would that be feasible as a fallback in case svn isn't present? > > The subversion project page has information on Windows versions: > > http://subversion.tigris.org/project_packages.html > > chris > > On Mar 6, 2008, at 10:43 AM, Scott Cain wrote: > > > Hi All, > > > > So now that the transition to svn is complete (and I like it), should > > anonymous cvs still be working? I believe there was discussion about > > keeping it going via mirroring, and I hope that is the case. It will > > make life a little easier for people who want to do automated installs > > of GBrowse and would like to use the installer script to get bioperl > > via > > anon cvs. If anon cvs is no longer available, does anyone have > > suggestions for the best route to take for getting command line svn on > > Windows? > > > > Thanks, > > Scott > > > > -- > > ------------------------------------------------------------------------ > > Scott Cain, Ph. D. cain.cshl at gmail.com > > GMOD Coordinator (http://www.gmod.org/) > > 216-392-3087 > > Cold Spring Harbor Laboratory > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. 
cain.cshl at gmail.com GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From cain.cshl at gmail.com Thu Mar 6 12:28:13 2008 From: cain.cshl at gmail.com (Scott Cain) Date: Thu, 06 Mar 2008 12:28:13 -0500 Subject: [Bioperl-l] anonymous cvs? In-Reply-To: <47D023F7.4000803@jays.net> References: <1204821815.6689.7.camel@frissell> <47D023F7.4000803@jays.net> Message-ID: <1204824493.6689.19.camel@frissell> Hi Jay, It could use anon svn, though svn is considerably less ubiquitous, so it effectively adds another prerequisite. For cvs, the GUI WinCVS provides command line cvs as well. I was wondering if there was an easy to install equivalent for svn, though it may be moot for me if the powers that be will provide a nightly tarball :-) Scott On Thu, 2008-03-06 at 11:03 -0600, Jay Hannah wrote: > Scott Cain wrote: > > It will make life a little easier for people who want to do automated installs > > of GBrowse and would like to use the installer script to get bioperl via > > anon cvs. > > Those installer scripts can't use anon SVN instead? > > > If anon cvs is no longer available, does anyone have > > suggestions for the best route to take for getting command line svn on > > Windows? > > > > At $work our Windows guys use GUIs for both CVS (repo dead this summer) > and SVN. Are there command-line (MS-DOS?) CVS clients for Windows? And > there isn't an SVN equivalent? > > j > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. 
cain at cshl.edu GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From cjfields at uiuc.edu Thu Mar 6 12:28:36 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 6 Mar 2008 11:28:36 -0600 Subject: [Bioperl-l] anonymous cvs? In-Reply-To: <1204824149.6689.14.camel@frissell> References: <1204821815.6689.7.camel@frissell> <84E60454-2B09-4F77-9BD6-4B9150304B2D@uiuc.edu> <1204824149.6689.14.camel@frissell> Message-ID: I'm working on the nightly build script now and will post back when everything is set up. chris On Mar 6, 2008, at 11:22 AM, Scott Cain wrote: > Hi Chris, > > I think a nightly generated tarball would be sufficient for my use. > We > used anon cvs to get the lastest bioperl and then threw it away once > it > was installed, so a tarball is just as good,if not better, since users > wouldn't need to install svn. Not needing to install svn is good > thing > for all my users, since I think many distributions do not supply it by > default. > > Thanks, > Scott > > > > On Thu, 2008-03-06 at 11:10 -0600, Chris Fields wrote: >> BioPerl CVS is no longer being updated; you have to use Subversion to >> grab the latest (we have anon. svn set up for this). We discussed >> syncing svn commits over to cvs but found it way too problematic and >> decided to make a clean break. >> >> The best option I can think of as a replacement (so everyone isn't >> dependent on installing svn to get Gbrowse and bioperl-live) is to >> get >> a cron job set up which drops a bioperl-live archive into >> bioperl.org/ >> DIST or bioperl.org/SRC. We have already talked about doing this for >> nightly builds from svn main trunk; we can probably set that up on >> our >> end. Would that be feasible as a fallback in case svn isn't present? 
>> >> The subversion project page has information on Windows versions: >> >> http://subversion.tigris.org/project_packages.html >> >> chris >> >> On Mar 6, 2008, at 10:43 AM, Scott Cain wrote: >> >>> Hi All, >>> >>> So now that the transition to svn is complete (and I like it), >>> should >>> anonymous cvs still be working? I believe there was discussion >>> about >>> keeping it going via mirroring, and I hope that is the case. It >>> will >>> make life a little easier for people who want to do automated >>> installs >>> of GBrowse and would like to use the installer script to get bioperl >>> via >>> anon cvs. If anon cvs is no longer available, does anyone have >>> suggestions for the best route to take for getting command line >>> svn on >>> Windows? >>> >>> Thanks, >>> Scott >>> >>> -- >>> ------------------------------------------------------------------------ >>> Scott Cain, Ph. D. cain.cshl at gmail.com >>> GMOD Coordinator (http://www.gmod.org/) >>> 216-392-3087 >>> Cold Spring Harbor Laboratory >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. cain.cshl at gmail.com > GMOD Coordinator (http://www.gmod.org/) > 216-392-3087 > Cold Spring Harbor Laboratory > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Mar 6 15:38:22 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 6 Mar 2008 14:38:22 -0600 Subject: [Bioperl-l] anonymous cvs? 
In-Reply-To: References: <1204821815.6689.7.camel@frissell> <84E60454-2B09-4F77-9BD6-4B9150304B2D@uiuc.edu> <1204824149.6689.14.camel@frissell> Message-ID: <2F746C5B-902C-4510-AEA3-2C46D4F51E7A@uiuc.edu> Okay, I have set up nightly builds for bioperl-live, db, network, and run here: http://www.bioperl.org/DIST/nightly_builds/ ftp://ftp.open-bio.org/pub/bioperl/DIST/nightly_builds At the moment this is running via a crontab off a script in my portal account, retrieving everything via anon. svn and bundling it up into zip and tarball archives. I would like to set it up to grab everything off dev but I don't want to mess with my ssh setup, so if anyone has ideas there... The script also adds a CHANGELOG file (last 10 commits) and removes the .svn directories prior to bundling. The archive name has the subversion revision number and date included; md5 checksums are in the SIGNATURES file. I'll check on it again tomorrow to make sure cron ran it. We can probably set up automated PPM builds as well; might be worth testing down the road (we need a way to set defaults for Build args prior to getting that running). chris On Mar 6, 2008, at 11:28 AM, Chris Fields wrote: > I'm working on the nightly build script now and will post back when > everything is set up. > > chris > > On Mar 6, 2008, at 11:22 AM, Scott Cain wrote: > >> Hi Chris, >> >> I think a nightly generated tarball would be sufficient for my >> use. We >> used anon cvs to get the lastest bioperl and then threw it away >> once it >> was installed, so a tarball is just as good,if not better, since >> users >> wouldn't need to install svn. Not needing to install svn is good >> thing >> for all my users, since I think many distributions do not supply it >> by >> default. >> >> Thanks, >> Scott >> >> >> >> On Thu, 2008-03-06 at 11:10 -0600, Chris Fields wrote: >>> BioPerl CVS is no longer being updated; you have to use Subversion >>> to >>> grab the latest (we have anon. svn set up for this). 
We discussed >>> syncing svn commits over to cvs but found it way too problematic and >>> decided to make a clean break. >>> >>> The best option I can think of as a replacement (so everyone isn't >>> dependent on installing svn to get Gbrowse and bioperl-live) is to >>> get >>> a cron job set up which drops a bioperl-live archive into >>> bioperl.org/ >>> DIST or bioperl.org/SRC. We have already talked about doing this >>> for >>> nightly builds from svn main trunk; we can probably set that up on >>> our >>> end. Would that be feasible as a fallback in case svn isn't >>> present? >>> >>> The subversion project page has information on Windows versions: >>> >>> http://subversion.tigris.org/project_packages.html >>> >>> chris >>> >>> On Mar 6, 2008, at 10:43 AM, Scott Cain wrote: >>> >>>> Hi All, >>>> >>>> So now that the transition to svn is complete (and I like it), >>>> should >>>> anonymous cvs still be working? I believe there was discussion >>>> about >>>> keeping it going via mirroring, and I hope that is the case. It >>>> will >>>> make life a little easier for people who want to do automated >>>> installs >>>> of GBrowse and would like to use the installer script to get >>>> bioperl >>>> via >>>> anon cvs. If anon cvs is no longer available, does anyone have >>>> suggestions for the best route to take for getting command line >>>> svn on >>>> Windows? >>>> >>>> Thanks, >>>> Scott >>>> >>>> -- >>>> ------------------------------------------------------------------------ >>>> Scott Cain, Ph. D. cain.cshl at gmail.com >>>> GMOD Coordinator (http://www.gmod.org/) >>>> 216-392-3087 >>>> Cold Spring Harbor Laboratory >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> Christopher Fields >>> Postdoctoral Researcher >>> Lab of Dr. 
Robert Switzer >>> Dept of Biochemistry >>> University of Illinois Urbana-Champaign >>> >>> >>> >> -- >> ------------------------------------------------------------------------ >> Scott Cain, Ph. D. cain.cshl at gmail.com >> GMOD Coordinator (http://www.gmod.org/) >> 216-392-3087 >> Cold Spring Harbor Laboratory >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Mar 6 16:48:37 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 6 Mar 2008 15:48:37 -0600 Subject: [Bioperl-l] Nightly build archives now available Message-ID: We now have nightly bundled archives for bioperl-live, bioperl-db, bioperl-run, and bioperl-network running; these will be updated ~ 1:00 am every night. http://www.bioperl.org/DIST/nightly_builds/ ftp://ftp.open-bio.org/pub/bioperl/DIST/nightly_builds The archives are date-stamped and also have the Subversion revision, just in case one wanted to ensure they get the correct version for the bug fix. They also contain a CHANGELOG file for the last 10 revisions (if there are any). These are currently derived off the anon. svn repository. chris From David.Messina at sbc.su.se Thu Mar 6 18:50:04 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Fri, 7 Mar 2008 00:50:04 +0100 Subject: [Bioperl-l] Nightly build archives now available In-Reply-To: References: Message-ID: <628aabb70803061550s24d7d8cfhf80495ea970a6c19@mail.gmail.com> Very slick and well-thought-out, Chris -- nice job! 
Dave From hlapp at gmx.net Thu Mar 6 19:06:41 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 6 Mar 2008 19:06:41 -0500 Subject: [Bioperl-l] Nightly build archives now available In-Reply-To: References: Message-ID: Awesome - thanks for doing this, Chris! -hilmar On Mar 6, 2008, at 4:48 PM, Chris Fields wrote: > We now have nightly bundled archives for bioperl-live, bioperl-db, > bioperl-run, and bioperl-network running; these will be updated ~ > 1:00 am every night. > > http://www.bioperl.org/DIST/nightly_builds/ > ftp://ftp.open-bio.org/pub/bioperl/DIST/nightly_builds > > The archives are date-stamped and also have the Subversion > revision, just in case one wanted to ensure they get the correct > version for the bug fix. They also contain a CHANGELOG file for > the last 10 revisions (if there are any). These are currently > derived off the anon. svn repository. > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From staffa at niehs.nih.gov Thu Mar 6 18:27:31 2008 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Thu, 06 Mar 2008 18:27:31 -0500 Subject: [Bioperl-l] SeqIO In-Reply-To: <03C512635899144083CADB0EE22201890172A836@alpaca.lan.ablynx.com> Message-ID: Thanks I really appreciate all the interest given and help generated. that sure sounds like a great idea, but i think Bio::Tools::GuessSeqFormat needs more RIGOR before it declares itself. Is there a substitute? It works great with >> !!NA_SEQUENCE 1.0 >> NewDNA Length: 810 March 5, 2008 18:26 Type: N Check: 3368 .. >> >> 1 TGTTCGAATT CCGTGCGGTC CACCTCCCCT AGGAGCTCAG TGGGCTGGTT >> et c. 
as seen in: gir.niehs.nih.gov> CGwindows.pl TestDNA.seq.org | more guesser guesses gcg TGTTCGAATTCCGTGCGGTCCACCTCCCCTAGGAGCTCAGTGGGCTGGTTGGATTCCGTGCCATCCCGGCAGGGCA GAGCCTCGGGA et c. (yes, I added my $file_type = $guesser->guess; print "guesser guesses $file_type\n"; ) BUT when applied to a genbank sequence passed thru the Seqlab editor and turned into GCG, to wit: !!NA_SEQUENCE 1.0 LOCUS HSPGK2G 1911 bp DNA PRI 12-SEP-1993 DEFINITION Human testis-specific PGK-2 gene for phosphoglycerate kinase (ATP:3-phospho-D-glycerate 1-phosphotransferase, EC 2.7.2.3). ACCESSION X05246 Y00261 ... ... BASE COUNT 583 a 367 c 442 g 519 t ORIGIN HSPGK2G Length: 1911 August 24, 1998 10:56 Type: N Check: 4156 .. 1 GCCCCTCAAC AGCAAGTTGG TTCTTCAGCA TTAAGATCCA GGTGTCAGCC et c. It thinks it is a flawed PIR: gir.niehs.nih.gov> CGwindows.pl hspgk2g.seq | more guesser guesses pir ------------- EXCEPTION ------------- MSG: PIR stream read attempted without leading '>P1;' [ !!NA_SEQUENCE 1.0 LOCUS HSPGK2G 1911 bp DNA PRI 12-SEP-1993 Must look at why guesser is thinking PIR. On 3/6/08 11:22 AM, "Marc Logghe" wrote: > Hi Nick, > I don't think you should leave out the -format option. You have to leave > it in but the format should be provided by the B::T::GuessSeqFormat > object. 
> Something like:
>
> #!/usr/bin/perl
> use strict;
> use Bio::SeqIO;
> use Bio::Tools::GuessSeqFormat;
>
> $| = 1;
> my $number_of_files = @ARGV;
> if (!$number_of_files) { print "no files entered\n"; exit; }
> foreach my $file (@ARGV) {
>     my $guesser = Bio::Tools::GuessSeqFormat->new(-file => $file);
>     # Pass the guessed format explicitly instead of omitting -format.
>     my $seqio_object = Bio::SeqIO->new(-file => $guesser->file,
>                                        -format => $guesser->guess);
>     my $seq_object = $seqio_object->next_seq;
>     my $sequence = $seq_object->seq;
>     print "$sequence\n";
> }
>
> HTH,
> Marc
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org
>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Staffa, Nick (NIH/NIEHS)
>> Sent: donderdag 6 maart 2008 16:24
>> To: Heikki Lehvaslaiho; bioperl-l at lists.open-bio.org
>> Cc: Chris Fields
>> Subject: Re: [Bioperl-l] SeqIO
>>
>> Here's the scoop:
>> When I use Jason's suggestion, (-format => 'gcg'),
>> my program works without complaint on the original file that looks like:
>> !!NA_SEQUENCE 1.0
>> NewDNA Length: 810 March 5, 2008 18:26 Type: N Check: 3368 ..
>>
>> 1 TGTTCGAATT CCGTGCGGTC CACCTCCCCT AGGAGCTCAG TGGGCTGGTT
>> et c.
>>
>> BUT if I remove the first line to test Bio::Tools::GuessSeqFormat,
>> (which should be retro-gcg format (before version 11?)),
>> my program runs, but there IS a complaint:
>> Use of uninitialized value in scalar chomp at
>> /usr/lib/perl5/site_perl/5.8.5/Bio/SeqIO/gcg.pm line 118, line 1.
>> BUT
>> If I remove (-format => 'gcg'), I get no complaint, but the sequence
>> returned still has its numbers embedded. This affects my calculations.
>>
>> Thanks, at least I know what my options are.
>>
>>
>>
>> Nick Staffa
>> Telephone: 919-316-4569 (NIEHS: 6-4569)
>> Scientific Computing Support Group
>> NIEHS Information Technology Support Services Contract
>> (Science Task Monitor: Roy W.
Reter (reter at niehs.nih.gov) >> National Institute of Environmental Health Sciences >> National Institutes of Health >> Research Triangle Park, North Carolina > From cjfields at uiuc.edu Thu Mar 6 23:32:39 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 6 Mar 2008 22:32:39 -0600 Subject: [Bioperl-l] Nightly build archives now available In-Reply-To: <628aabb70803061550s24d7d8cfhf80495ea970a6c19@mail.gmail.com> References: <628aabb70803061550s24d7d8cfhf80495ea970a6c19@mail.gmail.com> Message-ID: <5A67E3A9-9997-4A6B-AB07-8403D5FF388E@uiuc.edu> I would like to get automated PPM builds set up as well but I think we have to rework some Build.PL stuff to get that going. The next thing is to set up a regular script to check test/POD coverage. chris On Mar 6, 2008, at 5:50 PM, Dave Messina wrote: > Very slick and well-thought-out, Chris -- nice job! > > > Dave From Marc.Logghe at ablynx.com Fri Mar 7 04:04:35 2008 From: Marc.Logghe at ablynx.com (Marc Logghe) Date: Fri, 7 Mar 2008 10:04:35 +0100 Subject: [Bioperl-l] SeqIO In-Reply-To: Message-ID: <03C512635899144083CADB0EE22201890172A938@alpaca.lan.ablynx.com> Ahh, my reply did not make much sense when I took a new look. I was the one who learnt something here :-) Did not know that Bio::SeqIO was already using B::T::GuessSeqFormat under the hood. Learnt as well that you have to be careful with the filename extension because this seems to have precedence. Regards, Marc > -----Original Message----- > From: Staffa, Nick (NIH/NIEHS) [mailto:staffa at niehs.nih.gov] > Sent: vrijdag 7 maart 2008 0:28 > To: Marc Logghe; Heikki Lehvaslaiho; bioperl-l at lists.open-bio.org > Cc: Chris Fields > Subject: Re: [Bioperl-l] SeqIO > > Thanks > I really appreciate all the interest given and help generated. > that sure sounds like a great idea, but i think > Bio::Tools::GuessSeqFormat needs more RIGOR before it declares itself. > Is there a substitute? 
> It works great with > >> !!NA_SEQUENCE 1.0 > >> NewDNA Length: 810 March 5, 2008 18:26 Type: N Check: 3368 .. > >> > >> 1 TGTTCGAATT CCGTGCGGTC CACCTCCCCT AGGAGCTCAG TGGGCTGGTT > >> et c. > > as seen in: > gir.niehs.nih.gov> CGwindows.pl TestDNA.seq.org | more > guesser guesses gcg > TGTTCGAATTCCGTGCGGTCCACCTCCCCTAGGAGCTCAGTGGGCTGGTTGGATTCCGTGCCATCCCGGCAG GG > CA > GAGCCTCGGGA et c. > (yes, I added > my $file_type = $guesser->guess; > print "guesser guesses $file_type\n"; > ) > > BUT > when applied to a genbank sequence passed thru the Seqlab editor and > turned > into GCG, to wit: > !!NA_SEQUENCE 1.0 > LOCUS HSPGK2G 1911 bp DNA PRI 12-SEP-1993 > DEFINITION Human testis-specific PGK-2 gene for phosphoglycerate kinase > (ATP:3-phospho-D-glycerate 1-phosphotransferase, EC 2.7.2.3). > ACCESSION X05246 Y00261 > ... > ... > BASE COUNT 583 a 367 c 442 g 519 t > ORIGIN > > HSPGK2G Length: 1911 August 24, 1998 10:56 Type: N Check: 4156 .. > > 1 GCCCCTCAAC AGCAAGTTGG TTCTTCAGCA TTAAGATCCA GGTGTCAGCC > et c. > > It thinks it is a flawed PIR: > > gir.niehs.nih.gov> CGwindows.pl hspgk2g.seq | more > guesser guesses pir > > ------------- EXCEPTION ------------- > MSG: PIR stream read attempted without leading '>P1;' [ !!NA_SEQUENCE 1.0 > LOCUS HSPGK2G 1911 bp DNA PRI 12-SEP-1993 > > > Must look at why guesser is thinking PIR. > > > > > On 3/6/08 11:22 AM, "Marc Logghe" wrote: > > > Hi Nick, > > I don't think you should leave out the -format option. You have to leave > > it in but the format should be provided by the B::T::GuessSeqFormat > > object. 
> > Something like:
> >
> > #!/usr/bin/perl
> > use strict;
> > use Bio::SeqIO;
> > use Bio::Tools::GuessSeqFormat;
> >
> > $| = 1;
> > my $number_of_files = @ARGV;
> > if (!$number_of_files) { print "no files entered\n"; exit; }
> > foreach my $file (@ARGV) {
> >     my $guesser = Bio::Tools::GuessSeqFormat->new(-file => $file);
> >     # Pass the guessed format explicitly instead of omitting -format.
> >     my $seqio_object = Bio::SeqIO->new(-file => $guesser->file,
> >                                        -format => $guesser->guess);
> >     my $seq_object = $seqio_object->next_seq;
> >     my $sequence = $seq_object->seq;
> >     print "$sequence\n";
> > }
> >
> > HTH,
> > Marc
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org
> >> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Staffa, Nick (NIH/NIEHS)
> >> Sent: donderdag 6 maart 2008 16:24
> >> To: Heikki Lehvaslaiho; bioperl-l at lists.open-bio.org
> >> Cc: Chris Fields
> >> Subject: Re: [Bioperl-l] SeqIO
> >>
> >> Here's the scoop:
> >> When I use Jason's suggestion, (-format => 'gcg'),
> >> my program works without complaint on the original file that looks like:
> >> !!NA_SEQUENCE 1.0
> >> NewDNA Length: 810 March 5, 2008 18:26 Type: N Check: 3368 ..
> >>
> >> 1 TGTTCGAATT CCGTGCGGTC CACCTCCCCT AGGAGCTCAG TGGGCTGGTT
> >> et c.
> >>
> >> BUT if I remove the first line to test Bio::Tools::GuessSeqFormat,
> >> (which should be retro-gcg format (before version 11?)),
> >> my program runs, but there IS a complaint:
> >> Use of uninitialized value in scalar chomp at
> >> /usr/lib/perl5/site_perl/5.8.5/Bio/SeqIO/gcg.pm line 118, line 1.
> >> BUT
> >> If I remove (-format => 'gcg'), I get no complaint, but the sequence
> >> returned still has its numbers embedded. This affects my calculations.
> >>
> >> Thanks, at least I know what my options are.
> >>
> >>
> >>
> >> Nick Staffa
> >> Telephone: 919-316-4569 (NIEHS: 6-4569)
> >> Scientific Computing Support Group
> >> NIEHS Information Technology Support Services Contract
> >> (Science Task Monitor: Roy W.
Reter (reter at niehs.nih.gov) > >> National Institute of Environmental Health Sciences > >> National Institutes of Health > >> Research Triangle Park, North Carolina > > From bix at sendu.me.uk Fri Mar 7 05:32:01 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 07 Mar 2008 10:32:01 +0000 Subject: [Bioperl-l] Nightly build archives now available In-Reply-To: <5A67E3A9-9997-4A6B-AB07-8403D5FF388E@uiuc.edu> References: <628aabb70803061550s24d7d8cfhf80495ea970a6c19@mail.gmail.com> <5A67E3A9-9997-4A6B-AB07-8403D5FF388E@uiuc.edu> Message-ID: <47D119A1.10408@sendu.me.uk> Chris Fields wrote: > I would like to get automated PPM builds set up as well but I think we > have to rework some Build.PL stuff to get that going. What's the hold-up on that front? From heikki at sanbi.ac.za Fri Mar 7 06:09:25 2008 From: heikki at sanbi.ac.za (Heikki Lehvaslaiho) Date: Fri, 7 Mar 2008 13:09:25 +0200 Subject: [Bioperl-l] BioSQL V1.0.0 released Message-ID: <200803071309.25294.heikki@sanbi.ac.za> BIOSQL V1.0.0 RELEASED http://news.open-bio.org/archives/2008_03.html#000094 Congratulations, Hilmar! -Heikki -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ From cjfields at uiuc.edu Fri Mar 7 08:53:50 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 7 Mar 2008 07:53:50 -0600 Subject: [Bioperl-l] Nightly build archives now available In-Reply-To: <47D119A1.10408@sendu.me.uk> References: <628aabb70803061550s24d7d8cfhf80495ea970a6c19@mail.gmail.com> <5A67E3A9-9997-4A6B-AB07-8403D5FF388E@uiuc.edu> <47D119A1.10408@sendu.me.uk> Message-ID: I haven't tried it out yet, to tell the truth. 
The worry I have is prompting during the build process for database tests, networking, etc.

I have looked for it, but couldn't determine whether we have a way to run 'perl Build.PL' and bypass prompts with passed arguments. The only one I could find was 'network', for network tests.

Scott Cain and I have corresponded about this before, i.e. it would be nice to have boolean flags for each prompt (prereqs, database tests, scripts, network, etc). For nightly PPMs I would forego tests and include scripts.

chris

On Mar 7, 2008, at 4:32 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> I would like to get automated PPM builds set up as well but I think
>> we have to rework some Build.PL stuff to get that going.
>
> What's the hold-up on that front?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at uiuc.edu Fri Mar 7 08:22:27 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 7 Mar 2008 07:22:27 -0600
Subject: [Bioperl-l] BioSQL V1.0.0 released
In-Reply-To: <200803071309.25294.heikki@sanbi.ac.za>
References: <200803071309.25294.heikki@sanbi.ac.za>
Message-ID: <7558F8C6-FE40-4BAE-BA6A-D5039B10F350@uiuc.edu>

Same here. Great news!

chris

On Mar 7, 2008, at 5:09 AM, Heikki Lehvaslaiho wrote:

> BIOSQL V1.0.0 RELEASED
> http://news.open-bio.org/archives/2008_03.html#000094
>
>
> Congratulations, Hilmar!
> > -Heikki > > -- > ______ _/ _/_____________________________________________________ > _/ _/ > _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za > _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho > _/ _/ _/ SANBI, South African National Bioinformatics Institute > _/ _/ _/ University of Western Cape, South Africa > _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 > ___ _/_/_/_/_/________________________________________________________ > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Mar 7 09:10:08 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 07 Mar 2008 14:10:08 +0000 Subject: [Bioperl-l] Nightly build archives now available In-Reply-To: References: <628aabb70803061550s24d7d8cfhf80495ea970a6c19@mail.gmail.com> <5A67E3A9-9997-4A6B-AB07-8403D5FF388E@uiuc.edu> <47D119A1.10408@sendu.me.uk> Message-ID: <47D14CC0.8000104@sendu.me.uk> Chris Fields wrote: > I haven't tried it out yet, to tell the truth. The worry I have is > prompting during the build process for database tests, networking, etc. > > I have looked for it, but couldn't determine whether we have a way to > run 'perl Build.PL' and bypass prompts with passed arguments. The only > one I could find was 'network', for network tests. > > Scott Cain and I have corresponded about this before, i.e. it would be > nice to have boolean flags for each prompt (prereqs, database tests, > scripts, network, etc). For nightly PPMs I would forego tests and > include scripts. I don't quite understand how you're making the nightlys right now, but you should be using the dist actions: http://www.bioperl.org/wiki/Making_a_BioPerl_release Ie. 

One time (and one time only):
perl Build.PL (it doesn't matter how you answer the questions)

Then every night:
./Build dist
./Build ppmdist

You then upload the resulting .tar.gz and .zip files.

Only if Build.PL or ModuleBuildBioperl is updated might you need to:
./Build realclean
perl Build.PL
again. But this should be a rare event and even more rarely would it be /required/ (probably never).

From bix at sendu.me.uk Fri Mar 7 09:19:36 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 07 Mar 2008 14:19:36 +0000
Subject: [Bioperl-l] Nightly build archives now available
In-Reply-To: <47D14CC0.8000104@sendu.me.uk>
References: <628aabb70803061550s24d7d8cfhf80495ea970a6c19@mail.gmail.com> <5A67E3A9-9997-4A6B-AB07-8403D5FF388E@uiuc.edu> <47D119A1.10408@sendu.me.uk> <47D14CC0.8000104@sendu.me.uk>
Message-ID: <47D14EF8.5090107@sendu.me.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>> I haven't tried it out yet, to tell the truth. The worry I have is
>> prompting during the build process for database tests, networking, etc.
>>
>> I have looked for it, but couldn't determine whether we have a way to
>> run 'perl Build.PL' and bypass prompts with passed arguments. The
>> only one I could find was 'network', for network tests.
>>
>> Scott Cain and I have corresponded about this before, i.e. it would be
>> nice to have boolean flags for each prompt (prereqs, database tests,
>> scripts, network, etc). For nightly PPMs I would forego tests and
>> include scripts.
>
> I don't quite understand how you're making the nightlys right now, but
> you should be using the dist actions:
>
> http://www.bioperl.org/wiki/Making_a_BioPerl_release
>
> Ie.
>
> One time (and one time only):
> perl Build.PL (it doesn't matter how you answer the questions)
>
> Then every night:
> ./Build dist
> ./Build ppmdist
>
> You then upload the resulting .tar.gz and .zip files.
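
The nightly steps quoted above could be wrapped roughly as follows. This is only a sketch under stated assumptions: the function name `nightly_dist`, the `DRY_RUN` switch and the archive glob patterns are made up for illustration and are not part of Build.PL or ModuleBuildBioperl. It assumes `perl Build.PL` has already been run once in the checkout, and it clears old archives first because dist may otherwise ask before overwriting them.

```shell
#!/bin/sh
# Hypothetical wrapper around the "every night" steps from the thread.
# Assumes 'perl Build.PL' was already run once in the checkout directory.

nightly_dist() {
    src_dir=$1    # path to the checkout (caller-supplied)

    # DRY_RUN defaults to 1: only print the commands, run nothing.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "cd $src_dir"
        printf '%s\n' "rm -f ./*.tar.gz ./*.zip"
        printf '%s\n' "./Build dist"
        printf '%s\n' "./Build ppmdist"
        return 0
    fi

    (
        cd "$src_dir" || exit 1
        # Assumed glob patterns: remove last night's archives so that
        # 'dist' has nothing to ask about overwriting.
        rm -f ./*.tar.gz ./*.zip
        ./Build dist && ./Build ppmdist
    )
}
```

Calling `nightly_dist /path/to/checkout` prints the plan; setting `DRY_RUN=0` in the environment runs it for real. Uploading the resulting archives is left out, since the thread does not say how that is done.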

Ah, having uploaded the various archives, you'll have to manually delete them before running the dist action the next night; otherwise dist will ask whether you want to overwrite them. Apart from that, dist asks no questions.

From cjfields at uiuc.edu Fri Mar 7 09:28:36 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 7 Mar 2008 08:28:36 -0600
Subject: [Bioperl-l] Nightly build archives now available
In-Reply-To: <47D14CC0.8000104@sendu.me.uk>
References: <628aabb70803061550s24d7d8cfhf80495ea970a6c19@mail.gmail.com> <5A67E3A9-9997-4A6B-AB07-8403D5FF388E@uiuc.edu> <47D119A1.10408@sendu.me.uk> <47D14CC0.8000104@sendu.me.uk>
Message-ID:

On Mar 7, 2008, at 8:10 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> I haven't tried it out yet, to tell the truth. The worry I have is
>> prompting during the build process for database tests, networking,
>> etc.
>> I have looked for it, but couldn't determine whether we have a way
>> to run 'perl Build.PL' and bypass prompts with passed arguments.
>> The only one I could find was 'network', for network tests.
>> Scott Cain and I have corresponded about this before, i.e. it would
>> be nice to have boolean flags for each prompt (prereqs, database
>> tests, scripts, network, etc). For nightly PPMs I would forego
>> tests and include scripts.
>
> I don't quite understand how you're making the nightlys right now,
> but you should be using the dist actions:
>
> http://www.bioperl.org/wiki/Making_a_BioPerl_release
>