From barry.moore at genetics.utah.edu Thu Nov 1 00:03:01 2007
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed, 31 Oct 2007 22:03:01 -0600
Subject: [Bioperl-l] BLAST output parsing
In-Reply-To:
References: <13519112.post@talk.nabble.com>
Message-ID: <7BDC2187-1ABE-4CA1-AB86-98D5FD5433A4@genetics.utah.edu>

Swapna-

If you are using NCBI fasta files you can use files from NCBI's gene
database to map your gene IDs to names and organisms. Look in
particular at the files gene2accession, gene2refseq, and gene_info.
For example, if you had RefSeq protein IDs like NP_123456, you could
use gene2refseq to map those RefSeq accessions to gene IDs and then
gene_info to map the gene IDs to organisms and gene names.

B

On Oct 31, 2007, at 7:27 PM, Torsten Seemann wrote:

> Swapna,
>
>> I am new to bioperl. I did BLAST search of ~4000 genes and I need
>> to parse
>> it. I did use -m 9 option to get a tabular information of the
>> blast data.
>> But it does not include the gene names or the names of the
>> organisms of each
>> hit. Are there any parsers that can do this job ??
>
> The -m 9 tabular output does not include gene descriptions and
> organisms. It only includes the "gene id" that was present immediately
> after the ">" sign in the FASTA file that was used to create the BLAST
> database you specified with the -d option when you ran BLAST.
>
> Hence, no parser will help you. You either have to re-do the BLAST
> with a different -m value that includes the information you desire, or
> write code to convert your gene IDs into what you want.
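
A minimal sketch of the "write code to convert your gene IDs" route, using
the gene2refseq and gene_info files named above. The column positions are
assumptions about the tab-delimited layout and should be checked against
the header line of the downloaded files; the sample rows, the GENE1 symbol
and the helper names are hypothetical. Note that gene_info carries a
taxonomy ID rather than an organism name.

```perl
use strict;
use warnings;

# Hypothetical sample rows standing in for the real tab-delimited files.
# Assumed gene2refseq columns: 0 = tax_id, 1 = GeneID, 5 = protein accession.version
my @gene2refseq_lines = (
    join("\t", 9606, 1234, 'REVIEWED', 'NM_123456.1', 111, 'NP_123456.1', 222),
);
# Assumed gene_info columns: 0 = tax_id, 1 = GeneID, 2 = Symbol, 8 = description
my @gene_info_lines = (
    join("\t", 9606, 1234, 'GENE1', '-', '-', '-', '1', '1p1', 'made-up example gene'),
);

# Build a lookup: protein accession (version suffix stripped) -> GeneID.
sub index_gene2refseq {
    my (@lines) = @_;
    my %gene_for;
    for my $line (@lines) {
        next if $line =~ /^#/;              # skip header/comment lines
        chomp $line;
        my @f   = split /\t/, $line;
        my $acc = $f[5];
        next unless defined $acc && $acc ne '-';
        $acc =~ s/\.\d+$//;                 # NP_123456.1 -> NP_123456
        $gene_for{$acc} = $f[1];
    }
    return \%gene_for;
}

# Build a lookup: GeneID -> taxonomy ID, symbol and description.
sub index_gene_info {
    my (@lines) = @_;
    my %info_for;
    for my $line (@lines) {
        next if $line =~ /^#/;
        chomp $line;
        my @f = split /\t/, $line;
        $info_for{ $f[1] } = {
            tax_id      => $f[0],
            symbol      => $f[2],
            description => $f[8],
        };
    }
    return \%info_for;
}

# Chain the two lookups for one accession; returns undef if either step fails.
sub annotate {
    my ($acc, $gene_for, $info_for) = @_;
    (my $bare = $acc) =~ s/\.\d+$//;
    my $gene_id = $gene_for->{$bare};
    return unless defined $gene_id;
    my $info = $info_for->{$gene_id} or return;
    return +{ gene_id => $gene_id, %$info };
}

my $gene_for = index_gene2refseq(@gene2refseq_lines);
my $info_for = index_gene_info(@gene_info_lines);
my $hit      = annotate('NP_123456', $gene_for, $info_for);
printf "%s\t%s\t%s\t%s\n", 'NP_123456', @{$hit}{qw(gene_id symbol tax_id)} if $hit;
```

With real data, the two arrays would be replaced by the lines read from the
downloaded files, and annotate() would be called once per subject ID taken
from the -m 9 table.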
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From Rohit.Ghai at mikrobio.med.uni-giessen.de Thu Nov 1 05:45:43 2007
From: Rohit.Ghai at mikrobio.med.uni-giessen.de (Rohit Ghai)
Date: Thu, 01 Nov 2007 10:45:43 +0100
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperl on windows
Message-ID: <4729A047.2060507@mikrobio.med.uni-giessen.de>

Dear all,

I have emboss installed on a windows machine. (Embosswin). I can run
this from the dos command line and the path is present. However, when I
try to call an emboss application from bioperl I get an "Application
not found" error.

my $f = Bio::Factory::EMBOSS->new();
# get an EMBOSS application object from the factory
my $fuzznuc = $f->program('fuzznuc');
$fuzznuc->run(
    { -sequence => $infile,
      -pattern  => $motif,
      -outfile  => $outfile
    });

gives the following error

-------------------- WARNING ---------------------
MSG: Application [fuzznuc] is not available!
---------------------------------------------------
Can't call method "run" on an undefined value at searchPatterns.pl line 102.

Can somebody help me fix this ?

best regards
Rohit

--
Dr. Rohit Ghai
Institute of Medical Microbiology
Faculty of Medicine
Justus-Liebig University
Frankfurter Strasse 107
35392 - Giessen
GERMANY

Tel : 0049 (0)641-9946413
Fax : 0049 (0)641-9946409
Email: Rohit.Ghai at mikrobio.med.uni-giessen.de

From jason at bioperl.org Thu Nov 1 10:22:14 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 1 Nov 2007 10:22:14 -0400
Subject: [Bioperl-l] PAML/Codeml parsing
Message-ID:

PAML4 breaks our PAML parser right now because the order of things in
the result file has changed. Now sequences precede the information
about the version or the program run. This means that
$result->get_seqs() fails because we don't parse the sequences.
We'll see what we can do, but as usual with supporting 3rd party
programs it is brittle when file formats change.

-jason
--
Jason Stajich
jason at bioperl.org

From jason at bioperl.org Thu Nov 1 10:32:06 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 1 Nov 2007 10:32:06 -0400
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperl on windows
In-Reply-To: <4729A047.2060507@mikrobio.med.uni-giessen.de>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de>
Message-ID: <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org>

Presumably the PATH is not getting set properly - you should play
around printing the $ENV{PATH} variable in a perl script to see if it
actually contains the directory where the emboss programs are
installed. Bioperl can only guess so much as to where to find an
application. It is also possible that we aren't creating the proper
path to the executable - you can print the executable path with
   print $fuzznuc->executable
I believe, unless it is throwing an error at the program() line.

It looks like the code in the Factory object is a little fragile,
assuming that the programs HAVE to be in your $PATH. I don't know if
windows+perl is special in any way in how it runs things, so I can't
really tell if there are specific things you have to do here. You may
have to run this through cygwin in case PATH and such are just not
available properly to windowsPerl.

-jason

--
Jason Stajich
jason at bioperl.org

From cjfields at uiuc.edu Thu Nov 1 10:54:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 1 Nov 2007 09:54:09 -0500
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperl on windows
In-Reply-To: <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de> <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org>
Message-ID: <325E8599-793F-49DC-8680-9823F9389D4C@uiuc.edu>

This worked for me previously when I tested with WinXP on my old
machine using EMBOSS v5:

ftp://emboss.open-bio.org/pub/EMBOSS/windows

I haven't tried it with EMBOSSWin (latest is v 2.7); it's probably
better to use the latest EMBOSS version anyway, so I suggest trying the
version in the above link. I'll test it again today and let you know
what I find.
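
Whichever EMBOSS build is installed, it may help to check what the Perl
process itself can see. The following is a minimal sketch in core Perl
only; find_on_path is a hypothetical helper, not part of BioPerl, and it
does not reproduce whatever lookup Bio::Factory::EMBOSS performs
internally.

```perl
use strict;
use warnings;
use Config;
use File::Spec;

# Return the full path of the first file named $name found in the PATH
# directories, or undef. $path defaults to the PATH this process sees.
sub find_on_path {
    my ($name, $path) = @_;
    $path = $ENV{PATH} unless defined $path;
    return unless defined $path;

    my $sep  = $Config{path_sep} || ':';    # ';' on Windows, ':' elsewhere
    my @exts = ('');
    if ($^O eq 'MSWin32') {
        # Also try the usual executable suffixes (from PATHEXT if it is set).
        push @exts, split /;/, ($ENV{PATHEXT} || '.exe;.bat;.cmd');
    }

    for my $dir (split /\Q$sep\E/, $path) {
        next unless length $dir;
        for my $ext (@exts) {
            my $candidate = File::Spec->catfile($dir, $name . $ext);
            # On Windows a plain file is enough; elsewhere require the x bit.
            return $candidate
                if -f $candidate && ($^O eq 'MSWin32' || -x _);
        }
    }
    return;
}

my $found = find_on_path('fuzznuc');
print 'fuzznuc: ', (defined $found ? $found : 'not found on PATH'), "\n";
```

If this prints "not found on PATH" while the program runs fine from the
command prompt, the environment seen by perl probably differs from the one
seen by the shell.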
chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Thu Nov 1 11:31:40 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 1 Nov 2007 11:31:40 -0400
Subject: [Bioperl-l] PAML3 vs 4
Message-ID: <23575228-2FA3-4F07-BED4-4A2309A36D71@bioperl.org>

Small tweaks were needed to parse PAML4 results. Pairwise Ka, Ks
parsing (runmode -2) should be working more smoothly now on both PAML
3 and 4. You'll need to get the latest code from CVS in order to see
the changes to Bio/Tools/Phylo/PAML.pm

I've added tests for PAML4 in the parser and the run code. If you
have scripts that use codeml please give it a try. I have not
attempted to play with BASEML or AAML results at this point, so if you
also have code that uses those programs, please try it out and
provide bug reports if we need to fix things.
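
For anyone wanting to try the pairwise parsing, a minimal sketch follows.
It assumes BioPerl with the updated Bio/Tools/Phylo/PAML.pm is installed
and that the codeml result file name is given on the command line;
pairwise_dnds is a hypothetical helper name, and the dN, dS and omega keys
are what is believed to be stored in the ML matrix cells, so check them
against the module documentation.

```perl
use strict;
use warnings;

# Collect pairwise dN (Ka), dS (Ks) and omega values from a codeml
# result file. Sketch only: assumes BioPerl is installed.
sub pairwise_dnds {
    my ($mlc_file) = @_;
    require Bio::Tools::Phylo::PAML;    # loaded lazily, only when called

    my $parser = Bio::Tools::Phylo::PAML->new(-file => $mlc_file);
    my $result = $parser->next_result or return [];

    my @seqs   = $result->get_seqs;        # sequences in result order
    my $matrix = $result->get_MLmatrix;    # upper-triangular array of hashes

    my @rows;
    for my $i (0 .. $#seqs - 1) {
        for my $j ($i + 1 .. $#seqs) {
            my $cell = $matrix->[$i][$j] or next;
            push @rows, [
                $seqs[$i]->display_id,
                $seqs[$j]->display_id,
                @{$cell}{qw(dN dS omega)},
            ];
        }
    }
    return \@rows;
}

# Only runs when a result file is supplied as the first argument.
if (@ARGV) {
    for my $row (@{ pairwise_dnds($ARGV[0]) }) {
        print join("\t", map { defined $_ ? $_ : 'NA' } @$row), "\n";
    }
}
```

An empty table, or a failure at get_seqs, on a PAML4 result would be the
kind of thing worth reporting.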
-jason
--
Jason Stajich
jason at bioperl.org

From Kevin.M.Brown at asu.edu Thu Nov 1 13:25:30 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Thu, 1 Nov 2007 10:25:30 -0700
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperl onwindows
In-Reply-To: <4729A047.2060507@mikrobio.med.uni-giessen.de>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de>
Message-ID: <1A4207F8295607498283FE9E93B775B403EA7E06@EX02.asurite.ad.asu.edu>

Sounds like a path issue. Try to tell bioperl the full path to the
executable rather than just the executable name.

From Rohit.Ghai at mikrobio.med.uni-giessen.de Thu Nov 1 14:06:48 2007
From: Rohit.Ghai at mikrobio.med.uni-giessen.de (Rohit Ghai)
Date: Thu, 01 Nov 2007 19:06:48 +0100
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperlon windows
In-Reply-To: <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de> <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org>
Message-ID: <472A15B8.7040502@mikrobio.med.uni-giessen.de>

Thanks for all the suggestions... but I unfortunately still cannot run
emboss. I am running the latest version of embosswin (2.10.0-Win-0.8),
and the path is set correctly. I printed $ENV{PATH} and this contains
C:\EMBOSSwin which is the correct location.
I also tried setting the path directly but I'm not sure how to do this,
so I tried this...

my $fuzznuc = $f->program('C:\\EMBOSSwin\\fuzznuc');

this also did not work.

Also tried printing...
$fuzznuc->executable()

gave the following error again
-------------------- WARNING ---------------------
MSG: Application [C:\EMBOSSwin\fuzznuc] is not available!
---------------------------------------------------

Any more ideas ?

thanks !
Rohit

here's the code...

use strict;
use Bio::Factory::EMBOSS;
use Data::Dumper;

#
# print "PATH=$ENV{PATH}\n";
# path contains C:\EMBOSSwin which is the correct location
# embossversion is 2.10.0-Win-0.8

my $f = Bio::Factory::EMBOSS->new();
# get an EMBOSS application object from the factory
print Dumper ($f);
my $fuzznuc = $f->program('C:\\EMBOSSwin\\fuzznuc'); #tried fuzznuc.exe as well,
print Dumper ($fuzznuc);

#dump of the factory object ($f)
#$VAR1 = bless( {
#                 '_programgroup' => {},
#                 '_programs' => {},
#                 '_groups' => {}
#               }, 'Bio::Factory::EMBOSS' );

#print "executing -- >", $fuzznuc->executable, "\n" ; # doesn't work

my $infile = "temp.fasta";
my $motif = "ATGTCGATC";
my $outfile = "test.out";

$fuzznuc->run(
    { -sequence => $infile,
      -pattern  => $motif,
      -outfile  => $outfile
    });

Here's the error again....

#-------------------- WARNING ---------------------
#MSG: Application [C:\EMBOSSwin\fuzznuc] is not available!
#---------------------------------------------------

--
Dr. Rohit Ghai
Institute of Medical Microbiology
Faculty of Medicine
Justus-Liebig University
Frankfurter Strasse 107
35392 - Giessen
GERMANY

Tel : 0049 (0)641-9946413
Fax : 0049 (0)641-9946409
Email: Rohit.Ghai at mikrobio.med.uni-giessen.de

From jason at bioperl.org Thu Nov 1 14:37:24 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 1 Nov 2007 14:37:24 -0400
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperlon windows
In-Reply-To: <472A15B8.7040502@mikrobio.med.uni-giessen.de>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de> <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org> <472A15B8.7040502@mikrobio.med.uni-giessen.de>
Message-ID: <6968D1EB-FED3-463D-AF12-74A7D7F2FF3C@bioperl.org>

You could try this - can't test it though so not sure.
   my $fuzznuc = $f->program('fuzznuc');
   $fuzznuc->executable('C:\EMBOSSwin\fuzznuc');

-jason

--
Jason Stajich
jason at bioperl.org

From Rohit.Ghai at mikrobio.med.uni-giessen.de Thu Nov 1 14:41:41 2007
From: Rohit.Ghai at mikrobio.med.uni-giessen.de (Rohit Ghai)
Date: Thu, 01 Nov 2007 19:41:41 +0100
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperlonwindows
In-Reply-To: <6968D1EB-FED3-463D-AF12-74A7D7F2FF3C@bioperl.org>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de> <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org> <472A15B8.7040502@mikrobio.med.uni-giessen.de> <6968D1EB-FED3-463D-AF12-74A7D7F2FF3C@bioperl.org>
Message-ID: <472A1DE5.30207@mikrobio.med.uni-giessen.de>

Hi Jason

I tried this as well. This also gives the same error message.

-Rohit

--
Dr. Rohit Ghai
Institute of Medical Microbiology
Faculty of Medicine
Justus-Liebig University
Frankfurter Strasse 107
35392 - Giessen
GERMANY

Tel : 0049 (0)641-9946413
Fax : 0049 (0)641-9946409
Email: Rohit.Ghai at mikrobio.med.uni-giessen.de

From MEC at stowers-institute.org Thu Nov 1 14:57:33 2007
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 1 Nov 2007 13:57:33 -0500
Subject: [Bioperl-l] bioperl: cannot run emboss programs usingbioperlonwindows
In-Reply-To: <472A1DE5.30207@mikrobio.med.uni-giessen.de>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de><80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org><472A15B8.7040502@mikrobio.med.uni-giessen.de><6968D1EB-FED3-463D-AF12-74A7D7F2FF3C@bioperl.org> <472A1DE5.30207@mikrobio.med.uni-giessen.de>
Message-ID:

In the code
http://doc.bioperl.org/bioperl-run/Bio/Factory/EMBOSS.html#CODE6

there is a call to `wossname` (c.f.
http://emboss.sourceforge.net/apps/release/4.0/emboss/apps/wossname.html )

Is wossname in your path?

Maybe it needs to be wossname.exe under windows?

Malcolm Cook

From arareko at campus.iztacala.unam.mx Thu Nov 1 15:51:41 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 01 Nov 2007 13:51:41 -0600
Subject: [Bioperl-l] bioperl: cannot run emboss programs usingbioperlonwindows
In-Reply-To:
References: <4729A047.2060507@mikrobio.med.uni-giessen.de><80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org><472A15B8.7040502@mikrobio.med.uni-giessen.de><6968D1EB-FED3-463D-AF12-74A7D7F2FF3C@bioperl.org> <472A1DE5.30207@mikrobio.med.uni-giessen.de>
Message-ID: <472A2E4D.8080903@campus.iztacala.unam.mx>

Don't the EMBOSS binaries live under 'bin'? Perhaps try setting PATH
($ENV{PATH}) to 'C:\EMBOSSwin\bin' or using this:

my $fuzznuc = $f->program('fuzznuc');
$fuzznuc->executable('C:\EMBOSSwin\bin\fuzznuc');

Adding .exe might be worth trying as well.

Mauricio.

Cook, Malcolm wrote:
> in the code
> http://doc.bioperl.org/bioperl-run/Bio/Factory/EMBOSS.html#CODE6
>
> there is a call to `wossname` (c.f.
> http://emboss.sourceforge.net/apps/release/4.0/emboss/apps/wossname.html
> )
>
> is wossname in your path?
>
> Maybe it needs to be wossname.exe under windows?
>
> Malcolm Cook

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From cjfields at uiuc.edu Thu Nov 1 16:07:39 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 1 Nov 2007 15:07:39 -0500
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperlonwindows
In-Reply-To: <472A1DE5.30207@mikrobio.med.uni-giessen.de>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de> <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org> <472A15B8.7040502@mikrobio.med.uni-giessen.de> <6968D1EB-FED3-463D-AF12-74A7D7F2FF3C@bioperl.org> <472A1DE5.30207@mikrobio.med.uni-giessen.de>
Message-ID: <28223F7B-045A-4CC7-8FE7-583D0F8F7D44@uiuc.edu>

I did a little investigating using my old PC and was able to get fuzznuc to run using BioPerl
and EMBOSS v5. I had to jump through a hoop or two but I managed to get it working.

First, realize that EMBOSSWin is NOT the latest EMBOSS for Windows. You need to remove EMBOSSWin and install the one I linked to previously (this is an actual EMBOSS beta release). It's possible the older EMBOSSWin can be configured, but I don't plan on checking it out myself.

Next, you need to ensure the binaries are in your PATH environment variable (test by running 'wossname' on the command line), then set EMBOSS_DATA to point at the EMBOSS data directory using a UNIX-like path (i.e. 'C:/mEMBOSS/data'); regular Win32 paths didn't work for me, and WinXP recognizes the UNIX-style form as a valid path. If you don't know how to set environment variables, go here:

http://vlaurie.com/computers2/Articles/environment.htm

Once that is set up you should be able to run the script using the latest (greatest?) EMBOSS.

chris

On Nov 1, 2007, at 1:41 PM, Rohit Ghai wrote:

> Hi Jason
>
> I tried this as well. This also gives the same error message.
>
> -Rohit

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From neetisomaiya at gmail.com Fri Nov 2 00:20:27 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 2 Nov 2007 09:50:27 +0530
Subject: [Bioperl-l] need help
Message-ID: <764978cf0711012120o11010624r5a43e51d33b25e75@mail.gmail.com>

Hi,

This is a perl question, not bioperl.
Can anyone point me to a perl program/code/function which can calculate the number of days between any two given dates?
Any help will be deeply appreciated.
Thanks.

--
-Neeti
Even my blood says, B positive

From whs at ebi.ac.uk Fri Nov 2 01:01:20 2007
From: whs at ebi.ac.uk (Will Spooner)
Date: Fri, 2 Nov 2007 05:01:20 +0000 (GMT)
Subject: [Bioperl-l] need help
In-Reply-To: <764978cf0711012120o11010624r5a43e51d33b25e75@mail.gmail.com>
References: <764978cf0711012120o11010624r5a43e51d33b25e75@mail.gmail.com>
Message-ID:

Hi Neeti,

A non-bioperl answer to your perl question: Date::Calc should do the trick.
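For anyone who cannot install Date::Calc, here is a minimal sketch of the same calculation using only the core Time::Local module. This swaps in a different technique from the one suggested above, and the helper name days_between is made up for illustration. With Date::Calc installed, its Delta_Days function takes the same six arguments (year, month, day twice) and should give the same difference.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: whole days from (y1, m1, d1) to (y2, m2, d2).
# The result is negative when the second date is the earlier one.
sub days_between {
    my ($y1, $m1, $d1, $y2, $m2, $d2) = @_;

    # timegm() expects a 0-based month. Working at midnight UTC keeps
    # daylight-saving changes out of the arithmetic.
    my $t1 = timegm(0, 0, 0, $d1, $m1 - 1, $y1);
    my $t2 = timegm(0, 0, 0, $d2, $m2 - 1, $y2);

    return ($t2 - $t1) / 86_400;    # seconds per day
}

# Days from 31 Oct 2007 to 2 Nov 2007
print days_between(2007, 10, 31, 2007, 11, 2), "\n";
```

Dates are passed as plain numbers, so any parsing of date strings is left to the caller.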
Will

On Fri, 2 Nov 2007, neeti somaiya wrote:

> Hi,
>
> This is a perl question, not bioperl.
> Can anyone point me to a perl program/code/function which can calculate the
> number of days between any two given dates.
> Any help will be deeply appreciated.
> Thanks.

From smarkel at accelrys.com Sat Nov 3 02:01:38 2007
From: smarkel at accelrys.com (Scott Markel)
Date: Fri, 2 Nov 2007 23:01:38 -0700
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperlon windows
In-Reply-To: <6968D1EB-FED3-463D-AF12-74A7D7F2FF3C@bioperl.org>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de> <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org> <472A15B8.7040502@mikrobio.med.uni-giessen.de> <6968D1EB-FED3-463D-AF12-74A7D7F2FF3C@bioperl.org>
Message-ID:

I set multiple environment variables in my code.

  use File::Spec;

  # $embossPath holds the EMBOSS installation directory
  $ENV{EMBOSS_ROOT}    = $embossPath;
  $ENV{EMBOSS_ACDROOT} = File::Spec->catdir($embossPath, "acd");
  $ENV{EMBOSS_DB_DIR}  = File::Spec->catdir($embossPath, "test");
  $ENV{EMBOSS_DATA}    = File::Spec->catdir($embossPath, "data");
  $ENV{PATH}           = $embossPath;

I found it necessary to set both PATH and EMBOSS_ROOT.

Scott

Scott Markel, Ph.D.
Principal Bioinformatics Architect   email: smarkel at accelrys.com
Accelrys (SciTegic R&D)              mobile: +1 858 205 3653
10188 Telesis Court, Suite 100       voice: +1 858 799 5603
San Diego, CA 92121                  fax: +1 858 799 5222
USA                                  web: http://www.accelrys.com

bioperl-l-bounces at lists.open-bio.org wrote on 01.11.2007 11:37:24:

> You could try this - can't test it though so not sure.
>   my $fuzznuc = $f->program('fuzznuc');
>   $fuzznuc->executable('C:\EMBOSSwin\fuzznuc');
>
> -jason

From Rohit.Ghai at mikrobio.med.uni-giessen.de Sat Nov 3 10:07:52 2007
From: Rohit.Ghai at mikrobio.med.uni-giessen.de (Rohit Ghai)
Date: Sat, 03 Nov 2007 15:07:52 +0100
Subject: [Bioperl-l] bioperl: cannot run emboss programs using bioperlon windows
In-Reply-To: <28223F7B-045A-4CC7-8FE7-583D0F8F7D44@uiuc.edu>
References: <4729A047.2060507@mikrobio.med.uni-giessen.de> <80BA54B5-72E6-4A5B-A124-D73256644DC9@bioperl.org> <472A15B8.7040502@mikrobio.med.uni-giessen.de> <6968D1EB-FED3-463D-AF12-74A7D7F2FF3C@bioperl.org> <472A1DE5.30207@mikrobio.med.uni-giessen.de> <28223F7B-045A-4CC7-8FE7-583D0F8F7D44@uiuc.edu>
Message-ID: <472C80B8.9050601@mikrobio.med.uni-giessen.de>

Dear all,

Thanks for all the different inputs on this topic. I was able to run
emboss applications on windows (vista), but with the following workaround. Chris suggested to remove EMBOSSwin and get another version. This I did. Scott suggested setting all the variables within the program. This I also tried, but actually these were already available to the program so this was also not the problem. The following line... my $fuzznuc = $f->program('fuzznuc') doesn't return a Bio::Tools::Run::EMBOSSApplication object. but using Bio::Tools::Run::EMBOSSApplication directly seems to work. It doesn't have any path issues. What is also curious is that $f->version returns the correct version of emboss running (no path problems here), and it looks like it runs the command "embossversion -auto" to get this information. If it can get at this command, its a bit peculiar why it cannot get the other programs. Or am I missing something here ? Please take a look at the code, I have commented within this... -Rohit use Bio::Factory::EMBOSS; use Data::Dumper; use Bio::Tools::Run::EMBOSSApplication; my $infile = "test.fasta"; my $motif = "AGGAGG"; my $outfile = "test.out"; my $f = Bio::Factory::EMBOSS->new(); # get an EMBOSS application object from the factory print Dumper $f; print "location=",$f->location,"\n"; #returns local print "version=", $f->version,"\n"; # this returns the correct version 5.0 (uses embossversion -auto internally, and seems to know where it is) print "info=", $f->program_info('fuzznuc'),"\n"; #returns nothing print "list=",$f->_program_list,"\n"; #returns nothing #however, my $fuzznuc = $f->program('fuzznuc'); or with path / or \\ or with exe suffix doesn't work #$fuzznuc->executable('C:/mEMBOSS/fuzznuc'); # doesnt work # the problem is that it does not return a Bio::Tools::Run::EMBOSSApplication object. 
#however, creating a EMBOSSApplication object directly makes it possible to run the program # my $application = Bio::Tools::Run::EMBOSSApplication->new(); $application->name('fuzznuc'); print Dumper $application; $application->run( { -sequence => $infile, -pattern => $motif, -outfile => $outfile }); print "Done\n"; exit; Chris Fields wrote: > I did a little investigating using my old PC and was able to get > fuzznuc to run using BioPerl and EMBOSS v5. I had to jump through a > hoop or two but I managed to get it working. > > First, realize that EMBOSSWin is NOT the latest EMBOSS for Windows. > You need to remove EMBOSSWin and install the one I linked to > previously (this is an actual EMBOSS beta release). It's possible > older EMBOSSWin can be configured, but I don't plan on checking it out > myself. > > Next, you need to ensure the binaries are in your PATH env. variable > (test by running 'wossname' on the command line), then set EMBOSS_DATA > to point at the EMBOSS data directory using a UNIX-like path (i.e. > 'C:/mEMBOSS/data'); regular Win32 paths didn't work for me and WinXP > recognizes the UNIX'y form as a valid path. If you don't know how to > set env. variables go here: > > http://vlaurie.com/computers2/Articles/environment.htm > > Once that is set up you should be able to run the script using the > latest (greatest?) EMBOSS. > > chris > > On Nov 1, 2007, at 1:41 PM, Rohit Ghai wrote: > >> Hi Jason >> >> I tried this as well. This also gives the same error message. >> >> -Rohit >> >> Jason Stajich wrote: >>> You could try this - can't test it though so not sure. >>> my $fuzznuc = $f->program('fuzznuc'); >>> $fuzznuc->executable('C:\EMBOSSwin\fuzznuc'); >>> >>> -jason >>> On Nov 1, 2007, at 2:06 PM, Rohit Ghai wrote: >>> >>>> >>>> >>>> Thanks for all the suggestions... but I unfortunately still cannot run >>>> emboss. I am running the latest version of embosswin >>>> (2.10.0-Win-0.8), >>>> and the >>>> path is set correctly. 
I printed $ENV{$PATH} and this contains >>>> C:\EMBOSSwin which is the correct location. >>>> I also tried setting the path directly but I'm not sure how to do >>>> this, >>>> so I tried this... >>>> >>>> my $fuzznuc = $f->program('C:\\EMBOSSwin\\fuzznuc'); >>>> >>>> this also did not work. >>>> >>>> Also tried printing... >>>> $fuzznuc->executable() >>>> >>>> gave the following error again >>>> -------------------- WARNING --------------------- >>>> MSG: Application [C:\EMBOSSwin\fuzznuc] is not available! >>>> --------------------------------------------------- >>>> >>>> Any more ideas ? >>>> >>>> thanks ! >>>> Rohit >>>> >>>> >>>> here's the code... >>>> >>>> use strict; >>>> use Bio::Factory::EMBOSS; >>>> use Data::Dumper; >>>> >>>> # >>>> # print "PATH=$ENV{PATH}\n"; >>>> # path contains C:\EMBOSSwin which is the correct location >>>> # embossversion is 2.10.0-Win-0.8 >>>> >>>> my $f = Bio::Factory::EMBOSS->new(); >>>> # get an EMBOSS application object from the factory >>>> print Dumper ($f); >>>> my $fuzznuc = $f->program('C:\\EMBOSSwin\\fuzznuc'); #tried >>>> fuzznuc.exe >>>> as well, >>>> print Dump ($fuzznuc); >>>> >>>> #dump of fuzznuc >>>> #$VAR1 = bless( { >>>> # '_programgroup' => {}, >>>> # '_programs' => {}, >>>> # '_groups' => {} >>>> # }, 'Bio::Factory::EMBOSS' ); >>>> >>>> #print "executing -- >", $fuzznuc->executable, "\n" ; # doesn't work >>>> >>>> my $infile = "temp.fasta"; >>>> my $motif = "ATGTCGATC"; >>>> my $outfile = "test.out"; >>>> >>>> >>>> $fuzznuc->run( >>>> { -sequence => $infile, >>>> -pattern => $motif, >>>> -outfile => $outfile >>>> }); >>>> >>>> Here's the error again.... >>>> >>>> #-------------------- WARNING --------------------- >>>> #MSG: Application [C:\EMBOSSwin\fuzznuc] is not available! 
>>>> #--------------------------------------------------- >>>> >>>> >>>> >>>> >>>> Jason Stajich wrote: >>>>> Presumably the PATH is not getting set properly - you should play >>>>> around printing the $ENV{PATH} variable in a perl script to see if >>>>> actually contains the directory where the emboss programs are >>>>> installed. Bioperl can only guess so much as to where to find an >>>>> application. It is also possible that we aren't creating the proper >>>>> path to the executable - you can print the executable path with >>>>> print $fuzznuc->executable >>>>> I believe unless it is throwing an error at the program() line. >>>>> >>>>> It looks like the code in the Factory object is a little fragile >>>>> assuming that the programs HAVE to be in your $PATH. I don't know if >>>>> windows+perl is special in any way that it run things so I can't >>>>> really tell if there is specific things you have to do here. You may >>>>> have to run this through cygwin in case PATH and such are just not >>>>> available properly to windowsPerl. >>>>> >>>>> -jason >>>>> On Nov 1, 2007, at 5:45 AM, Rohit Ghai wrote: >>>>> >>>>>> Dear all, >>>>>> >>>>>> I have emboss installed on a windows machine. (Embosswin). I can run >>>>>> this from the dos command line and the path is present. However, >>>>>> when I >>>>>> try to call >>>>>> an emboss application from bioperl I get a "Application not found >>>>>> error" >>>>>> >>>>>> >>>>>> my $f = Bio::Factory::EMBOSS->new(); >>>>>> # get an EMBOSS application object from the factory >>>>>> my $fuzznuc = $f->program('fuzznuc'); >>>>>> $fuzznuc->run( >>>>>> { -sequence => $infile, >>>>>> -pattern => $motif, >>>>>> -outfile => $outfile >>>>>> }); >>>>>> gives the following error >>>>>> >>>>>> -------------------- WARNING --------------------- >>>>>> MSG: Application [fuzznuc] is not available! 
>>>>>> --------------------------------------------------- >>>>>> Can't call method "run" on an undefined value at searchPatterns.pl >>>>>> line >>>>>> 102. >>>>>> >>>>>> Can somebody help me fix this ? >>>>>> >>>>>> best regards >>>>>> Rohit >>>>>> >>>>>> -- >>>>>> > > From hlapp at gmx.net Sun Nov 4 12:42:13 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sun, 4 Nov 2007 12:42:13 -0500 Subject: [Bioperl-l] question -- Bio::SeqFeature::Gene::Transcript In-Reply-To: <0918983F-BF45-4466-AF5C-8F1ACAE5EAE2@uni-potsdam.de> References: <0918983F-BF45-4466-AF5C-8F1ACAE5EAE2@uni-potsdam.de> Message-ID: <62FB6DE1-3F1D-428C-B108-4CF9EEB67DDD@gmx.net> Hi Stefanie, sorry for taking so long to respond - your email got buried in a pile while I was away on travel. The Bio::SeqFeature::Gene::* modules were written mostly with the motivation to have a model that can represent the results of gene predictors. GenBank AFAIK doesn't annotate introns explicitly, though they should be implicit from cDNA (or mRNA? or gene, as you say) features on genomic sequence. The Bioperl SeqIO parsers won't transform those into a Bio::SeqFeature::Gene-based model, but instead will yield just plain Bio::SeqFeatureI objects in a flat array. It's up to subsequent processing to build these into more hierarchical models. I'm not sure whether someone's done this already for GenBank-type feature tables. There is a Unflattener that at least attempts to build a feature hierarchy from the flat array that's compliant with the Sequence Ontology (or so I recall). I'm copying the list in case others have additional suggestions. 
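
A minimal sketch of the "implicit introns" idea, in plain Perl without any BioPerl calls: if the exon segments of a CDS or mRNA join are known, the introns are simply the gaps between consecutive segments. The coordinates and the helper name introns_from_exons are hypothetical and only serve as illustration; the sketch assumes a linear, forward-strand feature whose segments do not span the origin.

```perl
use strict;
use warnings;

# Hypothetical exon coordinates (1-based, inclusive), as they might come
# from a location such as join(100..250,400..520,700..910).
my @exons = ( [100, 250], [400, 520], [700, 910] );

# Return the gaps between consecutive exons as [start, end] pairs.
# Assumption: linear feature, no segment wraps around the origin.
sub introns_from_exons {
    my @sorted = sort { $a->[0] <=> $b->[0] } @_;
    my @introns;
    for my $i (1 .. $#sorted) {
        my $start = $sorted[$i - 1][1] + 1;   # first base after previous exon
        my $end   = $sorted[$i][0] - 1;       # last base before next exon
        push @introns, [$start, $end] if $start <= $end;
    }
    return @introns;
}

my @introns = introns_from_exons(@exons);
print join(',', map { "$_->[0]..$_->[1]" } @introns), "\n";
# prints 251..399,521..699
```

With a Bio::Seq object in hand, each pair could then probably be passed to $seq->subseq($start, $end) to fetch the intron sequence; minus-strand features would additionally need reverse-complementing.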
-hilmar On Oct 25, 2007, at 3:40 AM, Stefanie Hartmann wrote: > > > Hello Hilmar, > > I have a question about your bioperl module > Bio::SeqFeature::Gene::Transcript: > > I can't figure out how to generate the $gene object for use in this > line: > @introns = $gene->introns(); > > The data I'm working with is a local file in genbank format, and > I'm interested in extracting intron sequences (and maybe flanking > exons) for certain genes. I have been trying to get the introns via > the sequence features ('CDS' or 'gene'), but this has not been > working. Which approach will I have to take? > I'd be very grateful if you could point me into the right direction! > > Hope things are going well in Durham! And thank you in advance! > > Stefanie > > > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From downloadondemand at gmail.com Sun Nov 4 13:39:42 2007 From: downloadondemand at gmail.com (download on demand) Date: Sun, 4 Nov 2007 20:39:42 +0200 Subject: [Bioperl-l] Help with Bio::SeqIO Message-ID: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> Hi to all. 
I have a problem with a simplest script: use Bio::SeqIO; # get command-line arguments, or die with a usage statement my $usage = "x2y.pl infile infileformat outfile outfileformat\n"; my $infile = shift or die $usage; my $infileformat = shift or die $usage; # my $outfile = shift or die $usage; my $outfileformat = shift or die $usage; # create one SeqIO object to read in,and another to write out my $seq_in = Bio::SeqIO->new('-file' => "<$infile", '-format' => $infileformat); my $seq_out = Bio::SeqIO->new('-fh' => \*STDOUT, '-format' => $outfileformat); # write each entry in the input file to the output file while (my $inseq = $seq_in->next_seq) { # $seq_out->write_seq($inseq); # Whole sequence not needed for my $feat_object ($inseq->get_SeqFeatures) { if ($feat_object->primary_tag eq "CDS") { print $feat_object->get_tag_values('product'),"\n"; print $feat_object->location->start,"..",$feat_object->location->end,"\n"; print $feat_object->spliced_seq->seq,"\n\n"; } } The result seems OK to me, but in case of first CDS of NC_005213.gbk from here the output is wrong: It is: hypothetical protein 1..490885 TAAATGCGATTGCTATTAGAA..................................Truncated sequence................................... Should be: hypothetical protein 879..490883 ATGCGATTGCTATTAGAA...................................Truncated sequence....................................TAA This CDS have an unnatural location string: CDS complement(join(490883..490885,1..879)), but spliced_seq should handle these things? Please help me! Best regards, N. 
From cjfields at uiuc.edu Sun Nov 4 19:08:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 4 Nov 2007 18:08:34 -0600
Subject: [Bioperl-l] Help with Bio::SeqIO
In-Reply-To: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com>
References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com>
Message-ID: <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu>

Pass in (-nosort => 1) to spliced_seq:

print $feat_object->spliced_seq(-nosort => 1)->seq,"\n\n";

This ensures no sorting of sublocations occurs, if you want for instance typical GenBank/EMBL 'join' behavior.

To the other devs: shouldn't -nosort be the default behavior when the split location is a 'join'? In other words, should spliced_seq() be modified to take into account the split location type when returning sequence? GB/EMBL/DDBJ rel. notes indicate a 'join' explicitly indicates the order of the sequences is important when joined together; the current behavior is more like that for 'order'.

chris

On Nov 4, 2007, at 12:39 PM, download on demand wrote:

> Hi to all.
> > I have a problem with a simplest script: > > > > use Bio::SeqIO; > # get command-line arguments, or die with a usage statement > my $usage = "x2y.pl infile infileformat outfile > outfileformat\n"; > my $infile = shift or die $usage; > my $infileformat = shift or die $usage; > # my $outfile = shift or die $usage; > my $outfileformat = shift or die $usage; > > # create one SeqIO object to read in,and another to write out > my $seq_in = Bio::SeqIO->new('-file' => "<$infile", > '-format' => $infileformat); > my $seq_out = Bio::SeqIO->new('-fh' => \*STDOUT, > '-format' => $outfileformat); > > # write each entry in the input file to the output file > while (my $inseq = $seq_in->next_seq) { > > # $seq_out->write_seq($inseq); # Whole sequence not needed > > for my $feat_object ($inseq->get_SeqFeatures) > { > if ($feat_object->primary_tag eq "CDS") > { > print $feat_object->get_tag_values('product'),"\n"; > print > $feat_object->location->start,"..",$feat_object->location->end,"\n"; > print $feat_object->spliced_seq->seq,"\n\n"; > } > } > > > > The result seems OK to me, but in case of first CDS of > NC_005213.gbk from > here > the > output is wrong: > > It is: > hypothetical protein > 1..490885 > TAAATGCGATTGCTATTAGAA..................................Truncated > sequence................................... > > Should be: > hypothetical protein > 879..490883 > ATGCGATTGCTATTAGAA...................................Truncated > sequence....................................TAA > > > > This CDS have an unnatural location string: > CDS complement(join(490883..490885,1..879)), but > spliced_seq > should handle these things? > > Please help me! > Best regards, N. 
> _______________________________________________ > From jean-luc.jany at univ-brest.fr Mon Nov 5 03:26:52 2007 From: jean-luc.jany at univ-brest.fr (Jean-luc Jany) Date: Mon, 05 Nov 2007 09:26:52 +0100 Subject: [Bioperl-l] Bioperl + standalone blast on Mac= cannot find path to blastall Message-ID: <472ED3CC.2050305@univ-brest.fr> Dear Bioperl and Mac users, I am a Mac user and would like to run a script I made using Bio::Tools::Run::StandAloneBlast. Unfortunately, I did not manage to indicate to Bioperl the pathway to Blastall and other executables. I read carefully the following link http://www.bioperl.org/wiki/HOWTO:StandAloneBlast and tried to indicate the path to Blast, but I guess the way to proceed is slightly different in Mac and that I should not create .ncbirc and .bashrc files (e.g. should I modify the .profile file instead of .bashrc?) Actually, my blast file is in myname directory and comprises a /bin and a /data file. I have got my blastall and other executables in myname/blast/bin/blastall. Thank you in anticipation for your help. Jean-Luc From Rohit.Ghai at mikrobio.med.uni-giessen.de Mon Nov 5 06:36:16 2007 From: Rohit.Ghai at mikrobio.med.uni-giessen.de (Rohit Ghai) Date: Mon, 05 Nov 2007 12:36:16 +0100 Subject: [Bioperl-l] bioperl and emboss on windows Message-ID: <472F0030.7040200@mikrobio.med.uni-giessen.de> Dear all, thanks for all the different inputs on this topic, I was able to run emboss applications on windows (vista), but with the following workaround. Chris suggested to remove EMBOSSwin and get another version. This I did. Scott suggested setting all the variables within the program. This I also tried, but actually these were already available to the program so this was also not the problem. The following line... my $fuzznuc = $f->program('fuzznuc') doesn't return a Bio::Tools::Run::EMBOSSApplication object. but using Bio::Tools::Run::EMBOSSApplication directly seems to work. It doesn't have any path issues. 
What is also curious is that $f->version returns the correct version of emboss running (no path problems here), and it looks like it runs the command "embossversion -auto" to get this information. If it can get at this command, it's a bit peculiar that it cannot get at the other programs. Or am I missing something here ?

Please take a look at the code, I have commented within this...

-Rohit

use Bio::Factory::EMBOSS;
use Data::Dumper;
use Bio::Tools::Run::EMBOSSApplication;

my $infile = "test.fasta";
my $motif = "AGGAGG";
my $outfile = "test.out";

my $f = Bio::Factory::EMBOSS->new();
# get an EMBOSS application object from the factory
print Dumper $f;
print "location=",$f->location,"\n"; # returns local
print "version=", $f->version,"\n"; # this returns the correct version 5.0
                                    # (uses embossversion -auto internally, and seems to know where it is)
print "info=", $f->program_info('fuzznuc'),"\n"; # returns nothing
print "list=",$f->_program_list,"\n"; # returns nothing

#
# however, my $fuzznuc = $f->program('fuzznuc'); or with path / or \\ or with exe suffix doesn't work
# $fuzznuc->executable('C:/mEMBOSS/fuzznuc'); # doesn't work
# the problem is that it does not return a Bio::Tools::Run::EMBOSSApplication object.
#
# however, creating an EMBOSSApplication object directly makes it possible to run the program
#
my $application = Bio::Tools::Run::EMBOSSApplication->new();
$application->name('fuzznuc');
print Dumper $application;
$application->run(
    { -sequence => $infile,
      -pattern  => $motif,
      -outfile  => $outfile
    });
print "Done\n";
exit;

From neetisomaiya at gmail.com Mon Nov 5 07:20:04 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 5 Nov 2007 17:50:04 +0530
Subject: [Bioperl-l] perl question
Message-ID: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>

Again a perl question, and maybe a very trivial one.
How do I terminate a number like 3.1232010098 to only 3 decimal places in perl?
--
-Neeti
Even my blood says, B positive

From biology0046 at hotmail.com Mon Nov 5 07:16:13 2007
From: biology0046 at hotmail.com (=?gb2312?B?va0gzsTi/Q==?=)
Date: Mon, 05 Nov 2007 12:16:13 +0000
Subject: [Bioperl-l] how to extract intron information from gff files.
Message-ID:

Dear all:

i got a poplar genome gff file like this:

LG_I  src  exon         2598  3280  .  -  .  name "fgenesh1_pg.C_LG_I000001"; transcriptId 62649
LG_I  src  CDS          2598  3280  .  -  0  name "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 4
LG_I  src  start_codon  3278  3280  .  -  0  name "fgenesh1_pg.C_LG_I000001"
LG_I  src  stop_codon   2598  2600  .  -  0  name "fgenesh1_pg.C_LG_I000001"
LG_I  src  exon         3544  3918  .  -  .  name "fgenesh1_pg.C_LG_I000001"; transcriptId 62649
LG_I  src  CDS          3544  3918  .  -  2  name "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 3
LG_I  src  exon         4258  4740  .  -  .  name "fgenesh1_pg.C_LG_I000001"; transcriptId 62649
LG_I  src  CDS          4258  4740  .  -  2  name "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 2
LG_I  src  exon         5344  6388  .  -  .  name "fgenesh1_pg.C_LG_I000001"; transcriptId 62649
LG_I  src  CDS          5344  6388  .  -  2  name "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 1
LG_I  src  exon         8259  8528  .  -  .  name "fgenesh1_pg.C_LG_I000002"; transcriptId 62650
LG_I  src  CDS          8259  8528  .  -  0  name "fgenesh1_pg.C_LG_I000002"; proteinId 62650; exonNumber 3
LG_I  src  stop_codon   8259  8261  .  -  0  name "fgenesh1_pg.C_LG_I000002"
LG_I  src  exon         8897  8987  .  -  .  name "fgenesh1_pg.C_LG_I000002"; transcriptId 62650
LG_I  src  CDS          8897  8987  .  -  0  name "fgenesh1_pg.C_LG_I000002"; proteinId 62650; exonNumber 2
LG_I  src  exon         9831  9892  .  -  .  name "fgenesh1_pg.C_LG_I000002"; transcriptId 62650
LG_I  src  CDS          9831  9892  .  -  1  name "fgenesh1_pg.C_LG_I000002"; proteinId 62650; exonNumber 1
LG_I  src  start_codon  9890  9892  .  -  0  name "fgenesh1_pg.C_LG_I000002"

I try to use Bio::DB::GFF, but this module only applies to methods given in the gff file.
What I want to get is the introns, 5' UTRs and 3' UTRs, but this information is not contained in the GFF file. How can I get it through BioPerl? If I treat the gaps between exons as introns, and the non-CDS parts of the first and last exons as UTRs, how can I extract them from this GFF file?

Thanks~~
Wenkai

From spiros at lokku.com Mon Nov 5 07:36:36 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Mon, 5 Nov 2007 12:36:36 +0000
Subject: [Bioperl-l] perl question
In-Reply-To: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>
References: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>
Message-ID:

Hey,

use the `sprintf` function. More information can be found at http://perldoc.perl.org/functions/sprintf.html. For more proper rounding, you could use the Math::Round module, http://search.cpan.org/~grommel/Math-Round-0.05/Round.pm.

hope this helps,
spiros

On 11/5/07, neeti somaiya wrote:
>
> Again a perl question, and maybe a very trivial one.
> How do I terminate a number like 3.1232010098 to only 3 decimal places in
> perl?
>
> --
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From ak at ebi.ac.uk Mon Nov 5 07:43:06 2007
From: ak at ebi.ac.uk (Andreas Kahari)
Date: Mon, 5 Nov 2007 12:43:06 +0000
Subject: [Bioperl-l] perl question
In-Reply-To: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>
References: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>
Message-ID: <20071105124305.GC4491@ebi.ac.uk>

On Mon, Nov 05, 2007 at 05:50:04PM +0530, neeti somaiya wrote:
> Again a perl question, and maybe a very trivial one.
> How do I terminate a number like 3.1232010098 to only 3 decimal places in
> perl?

When displaying:

printf( "The number is %.3f\n", $number );

When making a string:

my $string = sprintf( "%.3f", $number );

BTW, this is cutting, not rounding.

Cheers,
Andreas

--
Andreas Kähäri :: Ensembl Software Developer
European Bioinformatics Institute (EMBL-EBI)
--------------------------------------------

From t.nugent at cs.ucl.ac.uk Mon Nov 5 07:37:15 2007
From: t.nugent at cs.ucl.ac.uk (Tim Nugent)
Date: Mon, 05 Nov 2007 12:37:15 +0000
Subject: [Bioperl-l] perl question
In-Reply-To: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>
References: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>
Message-ID: <472F0E7B.60303@cs.ucl.ac.uk>

Use Math::Round and nearest_ceil:

http://search.cpan.org/~grommel/Math-Round-0.05/Round.pm

neeti somaiya wrote:
> Again a perl question, and maybe a very trivial one.
> How do I terminate a number like 3.1232010098 to only 3 decimal places in
> perl?
>
>

--
Tim Nugent (MRes)
Research Student
Bioinformatics Unit
Department of Computer Science
University College London
Gower Street
London WC1E 6BT
Tel: 020-7679-0410
t.nugent at ucl.ac.uk
http://www.cs.ucl.ac.uk/staff/T.Nugent

From bix at sendu.me.uk Mon Nov 5 07:47:17 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 05 Nov 2007 12:47:17 +0000
Subject: [Bioperl-l] perl question
In-Reply-To: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>
References: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>
Message-ID: <472F10D5.5060006@sendu.me.uk>

neeti somaiya wrote:
> Again a perl question, and maybe a very trivial one.
> How do I terminate a number like 3.1232010098 to only 3 decimal places in
> perl?

Please don't use this list to ask general Perl questions.
See these instead:

http://perldoc.perl.org/perlfaq4.html
http://lists.cpan.org/
http://www.perlmonks.org/

$rounded = sprintf("%.3f", $number);

From Marc.Logghe at DEVGEN.com Mon Nov 5 07:39:36 2007
From: Marc.Logghe at DEVGEN.com (Marc Logghe)
Date: Mon, 5 Nov 2007 13:39:36 +0100
Subject: [Bioperl-l] perl question
References: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com>
Message-ID: <0C528E3670D8CE4B8E013F6749231AA601C3BB80@ANTARESIA.be.devgen.com>

Hi,

Have a look at
http://perldoc.perl.org/functions/sprintf.html#precision%2c-or-maximum-width

In your particular case:

my $f = 3.1232010098;
printf "%0.3f", $f;

HTH,
Marc

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> neeti somaiya
> Sent: Monday, November 05, 2007 1:20 PM
> To: bioperl-l
> Subject: [Bioperl-l] perl question
>
> Again a perl question, and maybe a very trivial one.
> How do I terminate a number like 3.1232010098 to only 3
> decimal places in perl?
>
> --
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From bix at sendu.me.uk Mon Nov 5 08:24:25 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 05 Nov 2007 13:24:25 +0000
Subject: [Bioperl-l] perl question
In-Reply-To: <20071105124305.GC4491@ebi.ac.uk>
References: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com> <20071105124305.GC4491@ebi.ac.uk>
Message-ID: <472F1989.90105@sendu.me.uk>

Andreas Kahari wrote:
> On Mon, Nov 05, 2007 at 05:50:04PM +0530, neeti somaiya wrote:
>> Again a perl question, and maybe a very trivial one.
>> How do I terminate a number like 3.1232010098 to only 3 decimal places in
>> perl?
> > When displaying: > > printf( "The number is %.3f\n", $number ); > > When making a string: > > my $string = sprintf( "%.3f", $number ); > > > BTW, this is cutting, not rounding. (s)printf rounds (ie. doesn't simply truncate), though for critical applications you should use your own rounding algorithm. From ak at ebi.ac.uk Mon Nov 5 08:56:24 2007 From: ak at ebi.ac.uk (Andreas Kahari) Date: Mon, 5 Nov 2007 13:56:24 +0000 Subject: [Bioperl-l] perl question In-Reply-To: <472F1989.90105@sendu.me.uk> References: <764978cf0711050420x800b663q1fd94b08f8a4b975@mail.gmail.com> <20071105124305.GC4491@ebi.ac.uk> <472F1989.90105@sendu.me.uk> Message-ID: <20071105135624.GD4491@ebi.ac.uk> On Mon, Nov 05, 2007 at 01:24:25PM +0000, Sendu Bala wrote: > Andreas Kahari wrote: > > On Mon, Nov 05, 2007 at 05:50:04PM +0530, neeti somaiya wrote: > >> Again a perl question, and maybe a very trivial one. > >> How do I terminate a number like 3.1232010098 to only 3 decimal places in > >> perl? > > > > When displaying: > > > > printf( "The number is %.3f\n", $number ); > > > > When making a string: > > > > my $string = sprintf( "%.3f", $number ); > > > > > > BTW, this is cutting, not rounding. > > (s)printf rounds (ie. doesn't simply truncate), though for critical > applications you should use your own rounding algorithm. They do indeed. Mea culpa. 
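
The rounding-versus-truncating point settled above can be checked with a few lines of core Perl. This is only a small sketch: the second test value, 3.1236, is an assumption chosen so that rounding and truncation give different results, and the int()-based truncation is one simple approach that can itself be affected by floating-point representation for other inputs.

```perl
use strict;
use warnings;

my $number = 3.1232010098;

# sprintf rounds to the requested number of decimal places.
my $rounded    = sprintf("%.3f", $number);   # "3.123"
my $rounded_up = sprintf("%.3f", 3.1236);    # "3.124": rounded, not cut

# Truncation has to be requested explicitly, here by scaling and int().
my $truncated = sprintf("%.3f", int(3.1236 * 1000) / 1000);   # "3.123"

print "$rounded $rounded_up $truncated\n";
```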
Andreas

--
Andreas Kähäri :: Ensembl Software Developer
European Bioinformatics Institute (EMBL-EBI)
--------------------------------------------

From jay at jays.net Mon Nov 5 10:14:17 2007
From: jay at jays.net (Jay Hannah)
Date: Mon, 5 Nov 2007 10:14:17 -0500
Subject: [Bioperl-l] Help with Bio::SeqIO
In-Reply-To: <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu>
References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu>
Message-ID: <8CA2A45C-1F82-47A2-841B-1BA92E1F4466@jays.net>

On Nov 4, 2007, at 7:08 PM, Chris Fields wrote:
> To the other devs: shouldn't -nosort be the default behavior when the
> split location is a 'join'?

I certainly think so.

> In other words, should spliced_seq() be
> modified to take into account the split location type when returning
> sequence? GB/EMBL/DDBJ rel. notes indicate a 'join' explicitly
> indicates the order of the sequences is important when joined
> together; the current behavior is more like that for 'order'.

I don't see any value to the sorting algorithm. All tests invoke -nosort => 1 (except a phase test where nosort doesn't matter anyway). In my limited experience the sorting only serves to break real-world splicing.

If there is no valid use then we can remove ~20 lines from SeqFeatureI.pm circa line 505.

If there is a valid use and someone would be so kind as to educate me I'd be happy to add tests which demonstrate them. :)

P.S. CSHL is neato. I plan on understanding some of this stuff some day.
:)

j
http://www.bioperl.org/wiki/User:Jhannah

From hlapp at duke.edu Mon Nov 5 11:03:16 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Mon, 5 Nov 2007 11:03:16 -0500
Subject: [Bioperl-l] Help with Bio::SeqIO
In-Reply-To: <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu>
References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu>
Message-ID:

I agree that there should be a meaningful default that results in "doing the right thing" in most cases if the user doesn't intervene. I'm not sure I understand all the details, but it sounds like sorting or not sorting should depend on the split location type unless the user overrides it by argument. That's what you're suggesting, right?

-hilmar

On Nov 4, 2007, at 7:08 PM, Chris Fields wrote:

> Pass in (-nosort => 1) to spliced_seq:
>
> print $feat_object->spliced_seq(-nosort => 1)->seq,"\n\n";
>
> This ensures no sorting of sublocations occurs, if you want for
> instance typical GenBank/EMBL 'join' behavior.
>
> To the other devs: shouldn't -nosort be the default behavior when
> the split location is a 'join'? In other words, should
> spliced_seq() be modified to take into account the split location
> type when returning sequence? GB/EMBL/DDBJ rel. notes indicate a 'join'
> explicitly indicates the order of the sequences is important when
> joined together; the current behavior is more like that for 'order'.
>
> chris
>
> On Nov 4, 2007, at 12:39 PM, download on demand wrote:
>
>> Hi to all.
>> >> I have a problem with a simplest script: >> >> >> >> use Bio::SeqIO; >> # get command-line arguments, or die with a usage statement >> my $usage = "x2y.pl infile infileformat outfile >> outfileformat\n"; >> my $infile = shift or die $usage; >> my $infileformat = shift or die $usage; >> # my $outfile = shift or die $usage; >> my $outfileformat = shift or die $usage; >> >> # create one SeqIO object to read in,and another to write >> out >> my $seq_in = Bio::SeqIO->new('-file' => "<$infile", >> '-format' => $infileformat); >> my $seq_out = Bio::SeqIO->new('-fh' => \*STDOUT, >> '-format' => $outfileformat); >> >> # write each entry in the input file to the output file >> while (my $inseq = $seq_in->next_seq) { >> >> # $seq_out->write_seq($inseq); # Whole sequence not needed >> >> for my $feat_object ($inseq->get_SeqFeatures) >> { >> if ($feat_object->primary_tag eq "CDS") >> { >> print $feat_object->get_tag_values('product'),"\n"; >> print >> $feat_object->location->start,"..",$feat_object->location->end,"\n"; >> print $feat_object->spliced_seq->seq,"\n\n"; >> } >> } >> >> >> >> The result seems OK to me, but in case of first CDS of >> NC_005213.gbk from >> here > Nanoarchaeum_equitans/> the >> output is wrong: >> >> It is: >> hypothetical protein >> 1..490885 >> TAAATGCGATTGCTATTAGAA..................................Truncated >> sequence................................... >> >> Should be: >> hypothetical protein >> 879..490883 >> ATGCGATTGCTATTAGAA...................................Truncated >> sequence....................................TAA >> >> >> >> This CDS have an unnatural location string: >> CDS complement(join(490883..490885,1..879)), but >> spliced_seq >> should handle these things? >> >> Please help me! >> Best regards, N. 
>> _______________________________________________ >> > > > From bernd.web at gmail.com Mon Nov 5 11:53:01 2007 From: bernd.web at gmail.com (Bernd Web) Date: Mon, 5 Nov 2007 17:53:01 +0100 Subject: [Bioperl-l] PSI-BLAST Message-ID: <716af09c0711050853l23087ac6j9f7d597580b66c46@mail.gmail.com> Hi, Is it possible with SearchIO to select a specific iteration (Results from round i) part of the PSI-blast report, when parsing this with SearchIO::blast? It seems the parser parses the complete report. If not implemented I could of course extract the specific part of the psi-blast report and then give it too SearchIO (e.g. with IO::String), but maybe I am missing a built-in option? Regards, Bernd From jay at jays.net Mon Nov 5 11:54:13 2007 From: jay at jays.net (Jay Hannah) Date: Mon, 5 Nov 2007 11:54:13 -0500 Subject: [Bioperl-l] Help with Bio::SeqIO In-Reply-To: References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu> Message-ID: On Nov 5, 2007, at 11:03 AM, Hilmar Lapp wrote: > I agree that there should be a meaningful default that results in > "doing the right thing" in most cases if the user doesn't intervene. > I'm not sure I understand all the details, but it sounds sorting or > not sorting should depend on the split location type unless the user > overrides it by argument. That's what you're suggesting, right? If someone knows why spliced_seq() should ever sort then I'm suggesting we add a test demonstrating a useful example of that. If no one has a useful example of when you would want spliced_seq() to sort then I'm suggesting we remove the sorting altogether and nosort goes away. I can provide/add many examples where sorting is bad. I do not know of a case where sorting is good. 
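
The effect under discussion can be reproduced with a toy model in core Perl. This is a sketch only, not BioPerl's spliced_seq() implementation: the 30-base circular "genome", the helper names revcomp and splice_join, and the location complement(join(28..30,1..9)) are all made up to mirror the shape of complement(join(490883..490885,1..879)) from the original report.

```perl
use strict;
use warnings;

# Toy circular genome, 30 bases: positions 1..9, filler, positions 28..30.
my $genome = "CAATCGCAT" . ("C" x 18) . "TTA";

sub revcomp {
    my ($s) = @_;
    $s = reverse $s;                 # scalar context: reverses the string
    $s =~ tr/ACGTacgt/TGCAtgca/;
    return $s;
}

# Concatenate the segments in the order given, then apply the strand.
sub splice_join {
    my ($seq, $strand, @parts) = @_;
    my $joined = join '',
        map { substr($seq, $_->[0] - 1, $_->[1] - $_->[0] + 1) } @parts;
    return $strand < 0 ? revcomp($joined) : $joined;
}

# complement(join(28..30,1..9)): segment order as written in the record.
my @as_written = ( [28, 30], [1, 9] );
my @sorted     = sort { $a->[0] <=> $b->[0] } @as_written;

my $in_order = splice_join($genome, -1, @as_written);
my $resorted = splice_join($genome, -1, @sorted);

print "join order kept: $in_order\n";   # ATGCGATTGTAA (start codon first)
print "segments sorted: $resorted\n";   # TAAATGCGATTG (stop codon first)
```

Keeping the order given in the join yields a sequence that starts with ATG and ends with TAA, while sorting the segments by start coordinate moves the stop codon to the front, which matches the symptom reported at the start of this thread.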
j
http://www.bioperl.org/wiki/User:Jhannah

From jason at bioperl.org Mon Nov 5 12:07:10 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 5 Nov 2007 12:07:10 -0500
Subject: [Bioperl-l] Help with Bio::SeqIO
In-Reply-To:
References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu>
Message-ID:

At one point the location order was not respected/saved I believe. I guess we will just assume the user will build up a SplitLocation in order (i.e. with add_sub_Location). I'll try and remember if there were any other particular reasons.

-jason

On Nov 5, 2007, at 11:03 AM, Hilmar Lapp wrote:

> I agree that there should be a meaningful default that results in
> "doing the right thing" in most cases if the user doesn't intervene.
> I'm not sure I understand all the details, but it sounds sorting or
> not sorting should depend on the split location type unless the user
> overrides it by argument. That's what you're suggesting, right?
>
> -hilmar
>
> On Nov 4, 2007, at 7:08 PM, Chris Fields wrote:
>
>> Pass in (-nosort => 1) to spliced_seq:
>>
>> print $feat_object->spliced_seq(-nosort => 1)->seq,"\n\n";
>>
>> This ensures no sorting of sublocations occurs, if you want for
>> instance typical GenBank/EMBL 'join' behavior.
>>
>> To the other devs: shouldn't -nosort be the default behavior when
>> the split location is a 'join'? In other words, should
>> spliced_seq() be modified to take into account the split location
>> type when returning sequence? GB/EMBL/DDBJ rel. notes indicate a 'join'
>> explicitly indicates the order of the sequences is important when
>> joined together; the current behavior is more like that for 'order'.
>>
>> chris
>>
>> On Nov 4, 2007, at 12:39 PM, download on demand wrote:
>>
>>> Hi to all.
>>> >>> I have a problem with a simplest script: >>> >>> >>> >>> use Bio::SeqIO; >>> # get command-line arguments, or die with a usage statement >>> my $usage = "x2y.pl infile infileformat outfile >>> outfileformat\n"; >>> my $infile = shift or die $usage; >>> my $infileformat = shift or die $usage; >>> # my $outfile = shift or die $usage; >>> my $outfileformat = shift or die $usage; >>> >>> # create one SeqIO object to read in,and another to write >>> out >>> my $seq_in = Bio::SeqIO->new('-file' => "<$infile", >>> '-format' => $infileformat); >>> my $seq_out = Bio::SeqIO->new('-fh' => \*STDOUT, >>> '-format' => $outfileformat); >>> >>> # write each entry in the input file to the output file >>> while (my $inseq = $seq_in->next_seq) { >>> >>> # $seq_out->write_seq($inseq); # Whole sequence not >>> needed >>> >>> for my $feat_object ($inseq->get_SeqFeatures) >>> { >>> if ($feat_object->primary_tag eq "CDS") >>> { >>> print $feat_object->get_tag_values('product'),"\n"; >>> print >>> $feat_object->location->start,"..",$feat_object->location->end,"\n"; >>> print $feat_object->spliced_seq->seq,"\n\n"; >>> } >>> } >>> >>> >>> >>> The result seems OK to me, but in case of first CDS of >>> NC_005213.gbk from >>> here >> Nanoarchaeum_equitans/> the >>> output is wrong: >>> >>> It is: >>> hypothetical protein >>> 1..490885 >>> TAAATGCGATTGCTATTAGAA..................................Truncated >>> sequence................................... >>> >>> Should be: >>> hypothetical protein >>> 879..490883 >>> ATGCGATTGCTATTAGAA...................................Truncated >>> sequence....................................TAA >>> >>> >>> >>> This CDS have an unnatural location string: >>> CDS complement(join(490883..490885,1..879)), but >>> spliced_seq >>> should handle these things? >>> >>> Please help me! >>> Best regards, N. 
>>> _______________________________________________ >>> >> >> >> > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org From cjfields at uiuc.edu Mon Nov 5 12:16:10 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 5 Nov 2007 11:16:10 -0600 Subject: [Bioperl-l] Help with Bio::SeqIO In-Reply-To: References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu> Message-ID: <69AE79C0-3775-4AAC-B846-AA0611C44EAB@uiuc.edu> Yes, we would sort based on the splittype() and default to a particular behavior ('join') if one isn't designated, maybe with a warning indicating the splittype() isn't defined. Using an 'order' or other defined types could also delineate a default sort/nosort behavior (probably the previous as it would replicate prior behavior). chris On Nov 5, 2007, at 10:03 AM, Hilmar Lapp wrote: > I agree that there should be a meaningful default that results in > "doing the right thing" in most cases if the user doesn't intervene. > I'm not sure I understand all the details, but it sounds sorting or > not sorting should depend on the split location type unless the user > overrides it by argument. That's what you're suggesting, right? > > -hilmar From cjfields at uiuc.edu Mon Nov 5 12:20:35 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 5 Nov 2007 11:20:35 -0600 Subject: [Bioperl-l] Help with Bio::SeqIO In-Reply-To: References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu> Message-ID: <70023491-3549-428D-9E5C-32275A33FF20@uiuc.edu> On Nov 5, 2007, at 10:54 AM, Jay Hannah wrote: > On Nov 5, 2007, at 11:03 AM, Hilmar Lapp wrote: >> I agree that there should be a meaningful default that results in >> "doing the right thing" in most cases if the user doesn't intervene. 
>> I'm not sure I understand all the details, but it sounds sorting or
>> not sorting should depend on the split location type unless the user
>> overrides it by argument. That's what you're suggesting, right?
>
> If someone knows why spliced_seq() should ever sort then I'm
> suggesting we add a test demonstrating a useful example of that.
>
> If no one has a useful example of when you would want spliced_seq()
> to sort then I'm suggesting we remove the sorting altogether and
> nosort goes away.
>
> I can provide/add many examples where sorting is bad. I do not know
> of a case where sorting is good.
>
> j
> http://www.bioperl.org/wiki/User:Jhannah

The behavior would be based on the current use of 'join', 'order', and
'bond' (the latter in GenPept records). I documented some cases here a
while back:

http://www.bioperl.org/wiki/BioPerl_Locations#Split

chris

From hlapp at duke.edu Mon Nov 5 12:32:24 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Mon, 5 Nov 2007 12:32:24 -0500
Subject: [Bioperl-l] Help with Bio::SeqIO
In-Reply-To: <69AE79C0-3775-4AAC-B846-AA0611C44EAB@uiuc.edu>
References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu> <69AE79C0-3775-4AAC-B846-AA0611C44EAB@uiuc.edu>
Message-ID: <13919657-0446-4821-9EE4-FD07C995C734@duke.edu>

Sounds good to me.

-hilmar

On Nov 5, 2007, at 12:16 PM, Chris Fields wrote:

> Yes, we would sort based on the splittype() and default to a
> particular behavior ('join') if one isn't designated, maybe with a
> warning indicating the splittype() isn't defined. Using an 'order'
> or other defined types could also delineate a default sort/nosort
> behavior (probably the previous as it would replicate prior behavior).
>
> chris
>
> On Nov 5, 2007, at 10:03 AM, Hilmar Lapp wrote:
>
>> I agree that there should be a meaningful default that results in
>> "doing the right thing" in most cases if the user doesn't intervene.
>> I'm not sure I understand all the details, but it sounds sorting or
>> not sorting should depend on the split location type unless the user
>> overrides it by argument. That's what you're suggesting, right?
>>
>> -hilmar
>

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at duke dot edu :
===========================================================

From cjfields at uiuc.edu Mon Nov 5 12:41:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 5 Nov 2007 11:41:27 -0600
Subject: [Bioperl-l] Help with Bio::SeqIO
In-Reply-To:
References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu>
Message-ID:

It may have something to do with remote locations or setting strand()
in sublocations. This may have popped up in relation to a LocationI
code audit I proposed a while back on the list which I never got
around to. Oh well...

I at least managed getting a wiki page started in case we decided to
make changes, with the intention of making it a HOWTO at some point:

http://www.bioperl.org/wiki/BioPerl_Locations

If we go through with the changes to spliced_seq(), should it be
implemented for inclusion in v1.6 or wait until v1.7?

chris

On Nov 5, 2007, at 11:07 AM, Jason Stajich wrote:

>
> At one point the location order was not respected/saved I believe.
> I guess we will just assume the user will build up a SplitLocation
> in order (i.e. add_SubLocation). I'll try and remember if there
> were any other particular reasons.
>
>
> -jason
> On Nov 5, 2007, at 11:03 AM, Hilmar Lapp wrote:
>
>> I agree that there should be a meaningful default that results in
>> "doing the right thing" in most cases if the user doesn't intervene.
>> I'm not sure I understand all the details, but it sounds sorting or
>> not sorting should depend on the split location type unless the user
>> overrides it by argument. That's what you're suggesting, right?
>>
>> -hilmar
>>
>> On Nov 4, 2007, at 7:08 PM, Chris Fields wrote:
>>
>>> Pass in (-nosort => 1) to spliced_seq:
>>>
>>> print $feat_object->spliced_seq(-nosort => 1)->seq,"\n\n";
>>>
>>> This ensures no sorting of sublocations occurs, if you want for
>>> instance typical GenBank/EMBL 'join' behavior.
>>>
>>> To the other devs: shouldn't -nosort be the default behavior when
>>> the split location is a 'join'? In other words, should spliced_seq
>>> () be modified to take into account the split location type when
>>> returning sequence? GB/EMBL/DDBJ rel. notes indicate a 'join'
>>> explicitly indicates the order of the sequences is important when
>>> joined together; the current behavior is more like that for 'order'.
>>>
>>> chris
>>>
>>> On Nov 4, 2007, at 12:39 PM, download on demand wrote:
>>>
>>>> Hi to all.
>>>>
>>>> I have a problem with a simplest script:
>>>>
>>>>
>>>>
>>>> use Bio::SeqIO;
>>>> # get command-line arguments, or die with a usage
>>>> statement
>>>> my $usage = "x2y.pl infile infileformat outfile
>>>> outfileformat\n";
>>>> my $infile = shift or die $usage;
>>>> my $infileformat = shift or die $usage;
>>>> # my $outfile = shift or die $usage;
>>>> my $outfileformat = shift or die $usage;
>>>>
>>>> # create one SeqIO object to read in,and another to write
>>>> out
>>>> my $seq_in = Bio::SeqIO->new('-file' => "<$infile",
>>>> '-format' => $infileformat);
>>>> my $seq_out = Bio::SeqIO->new('-fh' => \*STDOUT,
>>>> '-format' =>
>>>> $outfileformat);
>>>>
>>>> # write each entry in the input file to the output file
>>>> while (my $inseq = $seq_in->next_seq) {
>>>>
>>>> # $seq_out->write_seq($inseq); # Whole sequence not
>>>> needed
>>>>
>>>> for my $feat_object ($inseq->get_SeqFeatures)
>>>> {
>>>> if ($feat_object->primary_tag eq "CDS")
>>>> {
>>>> print $feat_object->get_tag_values('product'),"\n";
>>>> print
>>>> $feat_object->location->start,"..",$feat_object->location-
>>>> >end,"\n";
>>>> print $feat_object->spliced_seq->seq,"\n\n";
>>>> }
>>>> }
>>>>
>>>>
>>>>
>>>> The result seems OK to me, but in case of first CDS of
>>>> NC_005213.gbk from
>>>> here
>>>> Nanoarchaeum_equitans/> the
>>>> output is wrong:
>>>>
>>>> It is:
>>>> hypothetical protein
>>>> 1..490885
>>>> TAAATGCGATTGCTATTAGAA..................................Truncated
>>>> sequence...................................
>>>>
>>>> Should be:
>>>> hypothetical protein
>>>> 879..490883
>>>> ATGCGATTGCTATTAGAA...................................Truncated
>>>> sequence....................................TAA
>>>>
>>>>
>>>>
>>>> This CDS have an unnatural location string:
>>>> CDS complement(join(490883..490885,1..879)), but
>>>> spliced_seq
>>>> should handle these things?
>>>>
>>>> Please help me!
>>>> Best regards, N.
>>>> _______________________________________________
>>>>
>>>
>>>
>>>
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason at bioperl.org
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bosborne11 at verizon.net Mon Nov 5 11:05:41 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 05 Nov 2007 12:05:41 -0400
Subject: [Bioperl-l] Bioperl + standalone blast on Mac= cannot find path to blastall
In-Reply-To: <472ED3CC.2050305@univ-brest.fr>
Message-ID:

Jean-luc,

From what you've written it sounds like you're using bash and not some
other shell (e.g. tcsh, csh), right? If that's the case then create a
.bashrc file in your home directory, as well as a .ncbirc file. This
should work.

I'm no Unix expert but I've always configured tcsh on the Mac in the
same ways I'd configure it on Linux machines. Similarly, if you're
using bash then it will read its .bashrc file, regardless of what
flavor of Unix you use (and the same thing holds true for zsh or csh
or ...).

Brian O.
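
A minimal sketch of what such a ~/.bashrc might contain. The install
location ~/blast is an assumption taken from the layout described in
the question, and BLAST_HOME is a hypothetical helper variable used
only here. BLASTDIR is, as far as the StandAloneBlast HOWTO describes
it, the variable the Bioperl wrapper looks at; putting the directory
on PATH is the alternative.

```shell
# Lines for ~/.bashrc (sketch; adjust the path to your own install).
# BLAST_HOME is a made-up helper name, not something BLAST or Bioperl reads.
BLAST_HOME="${BLAST_HOME:-$HOME/blast}"

# Directory holding blastall and the other executables.
export BLASTDIR="$BLAST_HOME/bin"

# Also put the executables on PATH so they can be found by name.
export PATH="$BLASTDIR:$PATH"

# On a Mac, Terminal starts login shells, which read ~/.bash_profile
# rather than ~/.bashrc. One common arrangement is to add this line
# to ~/.bash_profile so that both kinds of shell see the same settings:
#   [ -f "$HOME/.bashrc" ] && . "$HOME/.bashrc"
```

After opening a new terminal window, `command -v blastall` should
print the full path to the executable if the settings were picked up.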

On 11/5/07 4:26 AM, "Jean-luc Jany" wrote:

> Dear Bioperl and Mac users,
>
> I am a Mac user and would like to run a script I made using
> Bio::Tools::Run::StandAloneBlast. Unfortunately, I did not manage to indicate
> to Bioperl the pathway to Blastall and other executables.
>
> I read carefully the following link
> http://www.bioperl.org/wiki/HOWTO:StandAloneBlast and tried to indicate the
> path to Blast, but I guess the way to proceed is slightly different in Mac and
> that I should not create .ncbirc and .bashrc files (e.g. should I modify the
> .profile file instead of .bashrc?)
>
> Actually, my blast file is in myname directory and comprises a /bin and a
> /data file. I have got my blastall and other executables in
> myname/blast/bin/blastall.
>
> Thank you in anticipation for your help.
>
> Jean-Luc
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From arareko at campus.iztacala.unam.mx Mon Nov 5 13:35:56 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 05 Nov 2007 12:35:56 -0600
Subject: [Bioperl-l] Bioperl + standalone blast on Mac= cannot find path to blastall
In-Reply-To:
References:
Message-ID: <472F628C.2000506@campus.iztacala.unam.mx>

If the ~/.bashrc file doesn't work for you, try renaming it to
~/.bash_profile and re-login, that might work best. ~/.bashrc works as
an individual per-interactive-shell startup file, whereas
~/.bash_profile is a personal initialization file, executed for login
shells. Hope this helps.

Regards,
Mauricio.

Brian Osborne wrote:
> Jean-luc,
>
> From what you've written it sounds like you're using bash and not some other
> shell (e.g. tcsh, csh), right? If that's the case then create a .bashrc file
> in your home directory, as well as a .ncbirc file. This should work.
>
> I'm no Unix expert but I've always configured tcsh on the Mac in the same
> ways I'd configure it on Linux machines. Similarly, if you're using bash
> then it will read its .bashrc file, regardless of what flavor of Unix you
> use (and the same thing holds true for zsh or csh or ...).
>
> Brian O.
>
>
> On 11/5/07 4:26 AM, "Jean-luc Jany" wrote:
>
>> Dear Bioperl and Mac users,
>>
>> I am a Mac user and would like to run a script I made using
>> Bio::Tools::Run::StandAloneBlast. Unfortunately, I did not manage to indicate
>> to Bioperl the pathway to Blastall and other executables.
>>
>> I read carefully the following link
>> http://www.bioperl.org/wiki/HOWTO:StandAloneBlast and tried to indicate the
>> path to Blast, but I guess the way to proceed is slightly different in Mac and
>> that I should not create .ncbirc and .bashrc files (e.g. should I modify the
>> .profile file instead of .bashrc?)
>>
>> Actually, my blast file is in myname directory and comprises a /bin and a
>> /data file. I have got my blastall and other executables in
>> myname/blast/bin/blastall.
>>
>> Thank you in anticipation for your help.
>>
>> Jean-Luc
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From hlapp at duke.edu Mon Nov 5 16:04:11 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Mon, 5 Nov 2007 16:04:11 -0500
Subject: [Bioperl-l] Help with Bio::SeqIO
In-Reply-To:
References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu>
Message-ID:

On Nov 5, 2007, at 12:41 PM, Chris Fields wrote:

> If we go through with the changes to spliced_seq(), should it be
> implemented for inclusion in v1.6 or wait until v1.7?

I would say they should be implemented ASAP because they 1) should not
change behavior for those for which the current default behavior was
already broken (and who therefore pass in -nosort), and 2) fix the
behavior for those who erroneously assumed that the code was going to
do the right thing by default.

I.e., it sounds mostly like a bugfix to me. Am I overlooking something?

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at duke dot edu :
===========================================================

From cjfields at uiuc.edu Mon Nov 5 17:12:23 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 5 Nov 2007 16:12:23 -0600
Subject: [Bioperl-l] Help with Bio::SeqIO
In-Reply-To:
References: <923c9ce30711041039q3f718911r63eaa5a093226df2@mail.gmail.com> <8543B6EA-7D37-4D59-B22F-01D34BA9C13D@uiuc.edu>
Message-ID: <980977BB-72C3-401A-848F-AEF2E602E4BE@uiuc.edu>

On Nov 5, 2007, at 3:04 PM, Hilmar Lapp wrote:

>
> On Nov 5, 2007, at 12:41 PM, Chris Fields wrote:
>
>> If we go through with the changes to spliced_seq(), should it be
>> implemented for inclusion in v1.6 or wait until v1.7?
>
> I would say they should be implemented ASAP because they 1) should
> not change behavior for those for which the current default
> behavior was already broken (and who therefore pass in -nosort),
> and 2) fix the behavior for those who erroneously assumed that the
> code was going to do the right thing by default.
>
> I.e., it sounds mostly like a bugfix to me. Am I overlooking
> something?
>
> -hilmar
> --

Okay; I'll try to get this in soon.

chris

From jean-luc.jany at univ-brest.fr Tue Nov 6 04:00:07 2007
From: jean-luc.jany at univ-brest.fr (Jean-luc Jany)
Date: Tue, 06 Nov 2007 10:00:07 +0100
Subject: [Bioperl-l] Bioperl + standalone blast on Mac= cannot find path to blastall
Message-ID: <47302D17.2030500@univ-brest.fr>

Thanks Brian.

Yes I use bash. I am going to follow your advice as soon as possible
(for some reasons I am unable to run bioperl) and come back to you to
tell you if it runs.

Jean-Luc

From jason at bioperl.org Tue Nov 6 16:18:35 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 6 Nov 2007 16:18:35 -0500
Subject: [Bioperl-l] lightweight sequence features
Message-ID:

I started a branch for implementing and playing with lightweight
feature object.
The branch is called 'lightweight_feature_branch'.

Right now it is about 70% faster just in object creation based on
parsing features using Bio::Tools::GFF and swapping the types of
features that are created. It uses arrays instead of hashes under the
hood.

So the objects don't have locations under the hood. My hope is if this
works okay we could use it for creating objects where we KNOW the
underlying features have simple locations, such as parsing in GFF
data.

-jason
--
Jason Stajich
jason at bioperl.org

From cjfields at uiuc.edu Tue Nov 6 16:57:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 6 Nov 2007 15:57:17 -0600
Subject: [Bioperl-l] lightweight sequence features
In-Reply-To:
References:
Message-ID: <5E209F80-2A49-4D6B-A621-04B27AF91D5D@uiuc.edu>

Bravo! I once benchmarked Location instance creation and found it
contributed quite a bit of overhead so the speedup with that and the
use of arrays makes quite a bit of sense to me.

You mention only simple locations; I'm guessing this doesn't handle
'fuzzy' ends? If it did I could see layering the feature data from the
get-go, so it could be used just about anywhere in the place of
SF::Generic. Maybe something to test out in 1.7?

chris

On Nov 6, 2007, at 3:18 PM, Jason Stajich wrote:

> I started a branch for implementing and playing with lightweight
> feature object. The branch is called 'lightweight_feature_branch'.
>
> Right now it is about 70% faster just in object creation based on
> parsing features using Bio::Tools::GFF and swapping the types of
> features that are created. It uses arrays instead of hashes under
> the hood.
>
> So the objects don't have locations under the hood. My hope is if
> this works okay we could use it for creating objects where we KNOW
> the underlying features have simple locations, such as parsing in
> GFF data.
>
> -jason
> --
> Jason Stajich
> jason at bioperl.org
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From jason at bioperl.org Tue Nov 6 23:14:55 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 6 Nov 2007 23:14:55 -0500
Subject: [Bioperl-l] lightweight sequence features
In-Reply-To: <5E209F80-2A49-4D6B-A621-04B27AF91D5D@uiuc.edu>
References: <5E209F80-2A49-4D6B-A621-04B27AF91D5D@uiuc.edu>
Message-ID:

Right - only for simple locations. I've got a bunch more tests and
fixes to put in. I am hoping this can be a fast replacement in the
case where we're dealing with this "unflattened" data (i.e. GFF in
FeatureIO & Gbrowse).

This is sort of a playground until I feel like I can really get it
tested a bit more. I'll give an all clear when the dust settles in
terms of the design if anyone wants to play/help.

-jason

On Nov 6, 2007, at 4:57 PM, Chris Fields wrote:

> Bravo! I once benchmarked Location instance creation and
> found it contributed quite a bit of overhead so the speedup with
> that and the use of arrays makes quite a bit of sense to me.
>
> You mention only simple locations; I'm guessing this doesn't handle
> 'fuzzy' ends? If it did I could see layering the feature data from
> the get-go, so it could be used just about anywhere in the place of
> SF::Generic. Maybe something to test out in 1.7?
>
> chris
>
> On Nov 6, 2007, at 3:18 PM, Jason Stajich wrote:
>
>> I started a branch for implementing and playing with lightweight
>> feature object. The branch is called 'lightweight_feature_branch'.
>>
>> Right now it is about 70% faster just in object creation based on
>> parsing features using Bio::Tools::GFF and swapping the types of
>> features that are created. It uses arrays instead of hashes under
>> the hood.
>>
>> So the objects don't have locations under the hood.
>> My hope is if
>> this works okay we could use it for creating objects where we KNOW
>> the underlying features have simple locations, such as parsing in
>> GFF data.
>>
>> -jason
>> --
>> Jason Stajich
>> jason at bioperl.org
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From heikki at sanbi.ac.za Wed Nov 7 05:05:59 2007
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 7 Nov 2007 12:05:59 +0200
Subject: [Bioperl-l] Bio::Tools::Run::Mdust
Message-ID: <200711071205.59576.heikki@sanbi.ac.za>

Hi Donald,

I started using your Mdust module in bioperl-run and ran into problems
immediately.

* Only Bio::Seq objects are accepted but not Bio::PrimarySeq objects,
  although the docs say otherwise
* Sequences are modified in place. That is really bad, because that
  means that the user has to know to create a copy before running
  Mdust on it.
* The docs say that you have to set MDUSTDIR envvar to tell the
  program where to find the binary. That is actually optional if the
  binary is on your path.
* The tests do not cover any of the options to the program

As a quick fix, I suggest that we:

* leave the current way of working for Bio::SeqI objects: sequence
  string is not masked but seqfeatures to that effect are added
* Modify run() to return the new masked sequence object when the
  target is a Bio::PrimarySeqI.
* fix the documentation

After that it will be possible to simply write:

  use Bio::Tools::Run::Mdust;
  my $mdust = Bio::Tools::Run::Mdust->new();
  my $seq_dusted = $mdust->run($seq);  # $seq->isa('Bio::PrimarySeqI')

Are you happy for me to do this or do you want to do it yourself?

Yours,
-Heikki

--
Heikki Lehvaslaiho    heikki at_sanbi _ac _za
skype: heikki_lehvaslaiho
SANBI, South African National Bioinformatics Institute
University of Western Cape, South Africa
Phone: +27 21 959 2096   FAX: +27 21 959 2512

From Kevin.M.Brown at asu.edu Wed Nov 7 13:04:50 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 7 Nov 2007 11:04:50 -0700
Subject: [Bioperl-l] Bio::Ext::Align?
Message-ID: <1A4207F8295607498283FE9E93B775B403F7F6FE@EX02.asurite.ad.asu.edu>

I installed bioperl-ext from CVS, but can't figure out what else is
missing to utilize Bio::Tools::pSW. The error I get from the example
script in the wiki is:

The C-compiled engine for Smith Waterman alignments (Bio::Ext::Align)
has not been installed.
Please read the install the bioperl-ext package

BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tools/pSW.pm line 128.
Compilation failed in require at ./align_test.pl line 3.
BEGIN failed--compilation aborted at ./align_test.pl line 3.

In /usr/lib/perl5/site_perl/5.8.5/Bio/Ext there is a folder called
Align, but no Align.pm file. I followed the directions in the wiki to
install 1.5.2_102 (think I had _100 installed previously).

Any thoughts on what I'm missing?

From jason at bioperl.org Wed Nov 7 14:52:16 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 7 Nov 2007 14:52:16 -0500
Subject: [Bioperl-l]