From kvddrift at earthlink.net Sun Feb 1 12:08:36 2004 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun Feb 1 12:15:44 2004 Subject: [Bioperl-l] Bio::UnivAln mourned In-Reply-To: <200402010251.i112pSEt031610@portal.open-bio.org> References: <200402010251.i112pSEt031610@portal.open-bio.org> Message-ID: <45F2F0CC-54D9-11D8-A807-003065A5FDCC@earthlink.net> > But I just migrated from linux to OS X, and my latest install (bioperl > 1.4) > was virgin to my new machine. > > After my scripts carped I placed UnivAln.pm from an old bioperl dist (I > used 0.7.0) > into my bioperl-1.x build tree (install-dist/Bio) and added it's name > to my MANIFEST > before running (or re-running) perl Makefile.PL. > > Now it works without apparent conflicts in my set-up, YMMV. > > I understand that maybe the demise of the module is due to its lack of > a maintainer keeping its > guts up-to-date with bioperl root structure. I would perhaps be willing > to help here. > Would there be objections to its revival? Or are there plans to expand > existing modules? Hi, I am the current maintainer of the fink package for bioperl. I have already submitted the new version (1.4) to fink a few weeks ago. Unfortunately, the powers that be have not yet put the package into the fink cvs, so it is not yet available. The fink team is currently restructuring the way perl modules get installed, and it seems that the bioperl package is on hold until this is resolved. If you want I can mail you the info file for bioperl offlist, so you can use it with fink. - Koen. From lstein at cshl.edu Mon Feb 2 04:07:10 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Mon Feb 2 04:13:56 2004 Subject: [Bioperl-l] Filehandle interface in Bio::AlignIO: broken in 1.4? In-Reply-To: References: Message-ID: <200402021107.10368.lstein@cshl.edu> So sorry about breaking the newFh method. You can restore the previous behavior by passing \*ARGV to the -fh argument when you create the SeqIO object. 
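For readers following the thread, here is a minimal sketch in plain Perl (no BioPerl required) of what passing an explicit filehandle changes. TinyReader and its two method names are hypothetical stand-ins invented for illustration; they are not BioPerl code. Per the message above, the real call would hand \*ARGV to the -fh argument of the IO object's constructor.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a BioPerl IO object; NOT part of BioPerl.
package TinyReader;

sub new {
    my ($class, %args) = @_;
    return bless { fh => $args{-fh} }, $class;
}

# Older style: fall back to the magic ARGV handle when no -fh was given.
# Not called below, because with no arguments it would wait on STDIN.
sub readline_with_fallback {
    my $self = shift;
    my $fh = $self->{fh} || \*ARGV;
    return scalar <$fh>;
}

# Newer style: with no handle there is nothing to read.
sub readline_strict {
    my $self = shift;
    my $fh = $self->{fh} or return;
    return scalar <$fh>;
}

package main;

# An in-memory filehandle stands in for real sequence input.
my $data = ">seq1\nACGT\n";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";

my $explicit = TinyReader->new(-fh => $fh);   # handle supplied by the caller
my $implicit = TinyReader->new();             # no handle supplied

print "explicit handle: ", $explicit->readline_strict;
print "no handle: ",
    (defined($implicit->readline_strict) ? "got data" : "nothing read"), "\n";

# With BioPerl installed, the equivalent of the explicit case would be
# something like: Bio::SeqIO->new(-fh => \*ARGV, -format => 'fasta')
```

The point of the sketch is only that a reader with an explicit handle behaves the same under either style, so supplying the handle yourself is a safe way to keep scripts that read from standard input working.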
Unfortunately the earlier behavior made it impossible to create a seqIO object that was write only, and as a result some higher-level modules were gobbling the STDIN inappropriately. Lincoln On Saturday 31 January 2004 05:07 pm, Jason Stajich wrote: > Dave - > > This is due to a change lincoln made to not default to the magic <> > operator when no filename is provided. It broke a lot of my > scripts too. > > [This is basically part of similar question that Peter was asking > wrt SeqIO] > > I would like to see it come back somehow but I am not sure how as > it causes certain things to block during the tests. > > it has nothing to do with the newFh method but with Bio::Root::IO > in the _readline method. > > < my $fh = $self->_fh || \*ARGV; > --- > > > my $fh = $self->_fh or return; > > If you want the old functionality just change that line back in > your code for the time being. > > > --jason > > On Sat, 31 Jan 2004, Dave NO SPAM Ardell wrote: > > Hi, > > > > I searched for this in docs and the mailing list but didn't find > > anything. Also > > looked quickly in bugzilla and the Changelist with no result. > > So sorry if I am rehashing something known. > > > > Was surprised after installing 1.4 that filehandle functionality > > documented > > as part of the interface in AlignIO seems broken. That is to say, > > > > 1: $stream = Bio::AlignIO->newFh('-format' => "$opt_i"); # read > > from standard input > > 2: while ( my $aln = <$stream> ) { > > > > doesn't work anymore. It doesn't die, but the diamond operator > > returns null regardless of input. During debugging, > > > > gdb> x $stream > > > > after line 1 above, executing with sequence input on STDIN > > gives: > > > > ------ BEGIN DEBUGGER OUTPUT > > 0 GLOB(0xbbd178) > > -> *Symbol::GEN0 > > Can't locate object method "FILENO" via package > > "Bio::AlignIO::fasta" at /System/Library/Perl/5.8.1/dumpvar.pl > > line 238. 
> > dumpvar::unwrap('GLOB(0xbbd178)',3,-2) called at > > /System/Library/Perl/5.8.1/dumpvar.pl line 118 > > dumpvar::DumpElem('GLOB(0xbbd178)',3,-2) called at > > /System/Library/Perl/5.8.1/dumpvar.pl line 223 > > dumpvar::unwrap('ARRAY(0xbbd0e8)',0,-1) called at > > /System/Library/Perl/5.8.1/dumpvar.pl line 33 > > main::dumpValue('ARRAY(0xbbd0e8)',-1) called at > > /System/Library/Perl/5.8.1/perl5db.pl line 5270 > > DB::dumpit('GLOB(0x143988)','ARRAY(0xbbd0e8)') called at > > /System/Library/Perl/5.8.1/perl5db.pl line 647 > > DB::eval called at /System/Library/Perl/5.8.1/perl5db.pl > > line 3314 > > DB::DB called at /Users/dave/Data/mybin/pi line 44 > > > > ----- END DEBUGGER OUTPUT ----- > > > > I see from recent scripts, for instance, bp_sreformat.pl, and > > from documentation > > that stdin/stdout functionality can be had through AlignIO::new > > > > But for my quickfix, I reinstalled bioperl 1.2.3 > > > > Maybe the change is intentional. > > Although I guess I would miss the filehandle interface, > > I could get over it =) > > Maybe the change should be documented though. > > > > Apologies if it is and I missed it. > > > > Dave > > > > > > --+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+-- > >+--+-- +-- > > David Ardell, Asst. Professor. tel : 46 (0) 18 471 6694 > > Linnaeus Centre for Bioinformatics fax : 46 (0) 18 471 6698 > > Uppsala University Biomedical Center > > http://www.lcb.uu.se/~dave Husargatan 3, Box 598, > > SE 751 24 Uppsala SWEDEN. > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From Bernhard.Schmalhofer at biomax.de Mon Feb 2 04:15:20 2004 From: Bernhard.Schmalhofer at biomax.de (Bernhard Schmalhofer) Date: Mon Feb 2 04:21:54 2004 Subject: [Bioperl-l] Testing BioPerl objects for equality In-Reply-To: References: Message-ID: <401E1528.5060309@biomax.de> Ewan Birney wrote: > > On Thu, 29 Jan 2004, Peter van Heusden wrote: > > >>I've got an idea for testing where I'd like to 'round-trip' through >>SeqIO: read in from a file on disk, write out again with write_seq() and >>then read in the file written by write_seq() and compare the two >>sequence objects. If they aren't equal, it means we've got a problem. > > > That sounds like a great idea... we've always had problems with diff'ing > the files because of whitespace issues, but diff'ing the objects sounds > great. > > >>To make this work requires some kind of equals() method on Seq, >>SeqFeature, etc. This doesn't seem to be there at the moment - or am I >>missing something? Maybe there should probably be some kind of >>Bio::ComparableI interface which provides an equals() abstract method. >> > If the roundtrip is starting from a file is a specific format, shouldn't it be possiple to compare the data structures of the sequence object directly? I was think of using something like Test::More::is_deeply(), which tells you where the data structures start to become different. CU, Bernhard -- ************************************************** Bernhard Schmalhofer Senior Developer Biomax Informatics AG Lochhamer Str. 
11 82152 Martinsried, Germany Tel: +49 89 895574-839 Fax: +49 89 895574-825 eMail: Bernhard.Schmalhofer@biomax.com Website: www.biomax.com ************************************************** From heikki at ebi.ac.uk Mon Feb 2 06:41:23 2004 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Mon Feb 2 06:47:25 2004 Subject: [Bioperl-l] working with large alignments Message-ID: <200402021141.23420.heikki@ebi.ac.uk> Albert Vilella, who is visiting me here at EBI, works with really big genomic sequence alignments. I've committed several of his modules into cvs for that purpose. The most important additions are: * Bio::Seq::LargeLocatableSeq Bio::RangeI compliant Bio::Seq::LargePrimarySeq uses File::Temp for seq storing * Bio::Seq::LargeSeqI Interface class for LargeSeq implementations * Bio::AlignIO::largemultifasta IO class creating Bio::Seq::LargeLocatableSeq and SimpleAlign objects The LargeLocatableSeq is based on code from Bio::Seq::LargePrimarySeq. Everything seems to work, but if we run the tests added to the end of the t/AlignIO.t file with larger files, the process still uses a large amount of memory. We'd be interested in hearing from anyone who can suggest improvements. If you are willing to test the code with larger data sets, I've put two files here: http://www.ebi.ac.uk/~lehvasla/bioperl/medium.largemultifasta (1.3M) http://www.ebi.ac.uk/~lehvasla/bioperl/large.largemultifasta (31M) Thanks, -Heikki and Albert -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. 
CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From m_conte at hotmail.com Mon Feb 2 07:11:25 2004 From: m_conte at hotmail.com (matthieu CONTE) Date: Mon Feb 2 07:17:39 2004 Subject: [Bioperl-l] Bio ::seqIO ::tigr Message-ID: Ok... But the method ?get_BioDatabaseAdaptor? doesn't exist in the Bio::DB::BioSQL::DBAdaptor module (documentation). I didn't find it on the bioperl-db web page Any idea ? Thanks Matthieu CONTE M. Sc. in Bioinformatics from SIB 00 33 06.68.90.28.70 m_conte@hotmail.com >From: Hilmar Lapp >To: "matthieu CONTE" >CC: bioperl-l@bioperl.org >Subject: Re: [Bioperl-l] Bio ::seqIO ::tigr >Date: Wed, 28 Jan 2004 08:55:15 -0800 > >I suspect you have an old version of bioperl-db, or a version mix-up. You >need to download and install the latest revision from CVS for bioperl-db. > >Note that if the root of the problem is with the pir parser then >load_seqdatabase.pl will not cure it, as it just uses any Bio::SeqIO >compliant parser to provide the input sequences. If the parser is broken >then there won't be input ... It just saves you the round-trip (and >possible errors associated with it) of going through swissprot format. > > -hilmar > >On Wednesday, January 28, 2004, at 02:07 AM, matthieu CONTE wrote: > >>Ok , I try directly with "load_seqdatabase.pl" but there is another >>problem..... >> >>[conte@bearn scripts]$ perl load_seqdatabase.pl -dbuser biosql -dbpass >>biosql -format tigr tigr >>/home/conte/pipeline_orthologues/data/orysa_tigr.txt >> >>Can't locate object method "get_BioDatabaseAdaptor" via package >>"Bio::DB::BioSQL::DBAdaptor" at load_seqdatabase.pl line 84. >> >>Indeed this method does not exist in Bio::DB::BioSQL::DBAdaptor.... >> >> >> >> >>Matthieu CONTE >>M. Sc. 
in Bioinformatics from SIB >> >>00 33 06.68.90.28.70 >>m_conte@hotmail.com >> >> >> >> >> >>>From: Hilmar Lapp >>>To: "matthieu CONTE" >>>CC: bioperl-l@bioperl.org >>>Subject: Re: [Bioperl-l] Bio ::seqIO ::tigr Date: Tue, 27 Jan 2004 >>>09:31:39 -0800 >>> >>>A question aside: why do you want to convert to swissprot in order to >>>load into biosql? (load_seqdatabase.pl can use any SeqIO reader.) >>> >>> -hilmar >>> >>>On Tuesday, January 27, 2004, at 02:50 AM, matthieu CONTE wrote: >>> >>>>I currently trying to use the Bio ::seqIO ::tigr module. >>>>My objective is to download the whole rice genome form Tigr ( adress >>>>below)and to integrate it in my BioSQL DB. >>>>For this I am trying to convert the tigr format in swiss format with the >>>>script below >>>> >>>> >>>>use Bio::SeqIO; >>>> >>>>my $in = Bio::SeqIO->new(-file >>>>=>'>>>=>'tigr'); >>>> >>>>my $out = Bio::SeqIO->new(-file => >>>>'>/home/conte/pipeline_orthologues/data/orysa_swiss.txt' , >>>>-format=>'swiss'); >>>> >>>>print $out $_ while <$in>; >>>> >>>>I obtain: >>>> >>>>------------ EXCEPTION ------------- >>>>MSG: [19]Required missing >>>>STACK Bio::SeqIO::tigr::throw >>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:1338 >>>>STACK Bio::SeqIO::tigr::_process_header >>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:700 >>>>STACK Bio::SeqIO::tigr::_process_assembly >>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:535 >>>>STACK Bio::SeqIO::tigr::_process_tigr >>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:453 >>>>STACK Bio::SeqIO::tigr::_process >>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:420 >>>>STACK Bio::SeqIO::tigr::_initialize >>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:90 >>>>STACK Bio::SeqIO::new >>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:358 >>>>STACK Bio::SeqIO::new >>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:378 >>>>STACK toplevel 
get_bioseq_tigr.pl:8 >>>> >>>>Could you please tell me if there is a problem with the parser or with >>>>the input data format of Tigr? >>>> >>>>Thanks in advance >>>> >>>> >>>> >>>> >>>>Matthieu CONTE >>>>m_conte@hotmail.com >>>> >>>>_________________________________________________________________ >>>>MSN Messenger : discutez en direct avec vos amis ! >>>>http://www.msn.fr/msger/default.asp >>>> >>>>_______________________________________________ >>>>Bioperl-l mailing list >>>>Bioperl-l@portal.open-bio.org >>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> >>>-- >>>------------------------------------------------------------- >>>Hilmar Lapp email: lapp at gnf.org >>>GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 >>>------------------------------------------------------------- >>> >>> >> >>_________________________________________________________________ >>MSN Messenger : discutez en direct avec vos amis ! >>http://www.msn.fr/msger/default.asp >> >> >-- >------------------------------------------------------------- >Hilmar Lapp email: lapp at gnf.org >GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 >------------------------------------------------------------- > > _________________________________________________________________ MSN Search, le moteur de recherche qui pense comme vous ! http://search.msn.fr/worldwide.asp From hlapp at gmx.net Mon Feb 2 10:55:40 2004 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon Feb 2 11:01:57 2004 Subject: [Bioperl-l] Bio ::seqIO ::tigr In-Reply-To: Message-ID: <3FF205D0-5598-11D8-8C77-000A959EB4C4@gmx.net> That's why I said you seem to have a version mix-up in addition. get_BioDatabaseAdaptor is part of the 0.1 API, which was retired more than a year ago. -hilmar On Monday, February 2, 2004, at 04:11 AM, matthieu CONTE wrote: > > Ok... > But the method ?get_BioDatabaseAdaptor? doesn't exist in the > Bio::DB::BioSQL::DBAdaptor module (documentation). I didn't find it on > the bioperl-db web page > Any idea ? 
> > Thanks > > > > Matthieu CONTE > M. Sc. in Bioinformatics from SIB > > 00 33 06.68.90.28.70 > m_conte@hotmail.com > > > > > >> From: Hilmar Lapp >> To: "matthieu CONTE" >> CC: bioperl-l@bioperl.org >> Subject: Re: [Bioperl-l] Bio ::seqIO ::tigr >> Date: Wed, 28 Jan 2004 08:55:15 -0800 >> >> I suspect you have an old version of bioperl-db, or a version mix-up. >> You need to download and install the latest revision from CVS for >> bioperl-db. >> >> Note that if the root of the problem is with the pir parser then >> load_seqdatabase.pl will not cure it, as it just uses any Bio::SeqIO >> compliant parser to provide the input sequences. If the parser is >> broken then there won't be input ... It just saves you the round-trip >> (and possible errors associated with it) of going through swissprot >> format. >> >> -hilmar >> >> On Wednesday, January 28, 2004, at 02:07 AM, matthieu CONTE wrote: >> >>> Ok , I try directly with "load_seqdatabase.pl" but there is another >>> problem..... >>> >>> [conte@bearn scripts]$ perl load_seqdatabase.pl -dbuser biosql >>> -dbpass biosql -format tigr tigr >>> /home/conte/pipeline_orthologues/data/orysa_tigr.txt >>> >>> Can't locate object method "get_BioDatabaseAdaptor" via package >>> "Bio::DB::BioSQL::DBAdaptor" at load_seqdatabase.pl line 84. >>> >>> Indeed this method does not exist in Bio::DB::BioSQL::DBAdaptor.... >>> >>> >>> >>> >>> Matthieu CONTE >>> M. Sc. in Bioinformatics from SIB >>> >>> 00 33 06.68.90.28.70 >>> m_conte@hotmail.com >>> >>> >>> >>> >>> >>>> From: Hilmar Lapp >>>> To: "matthieu CONTE" >>>> CC: bioperl-l@bioperl.org >>>> Subject: Re: [Bioperl-l] Bio ::seqIO ::tigr Date: Tue, 27 Jan 2004 >>>> 09:31:39 -0800 >>>> >>>> A question aside: why do you want to convert to swissprot in order >>>> to load into biosql? (load_seqdatabase.pl can use any SeqIO >>>> reader.) 
>>>> >>>> -hilmar >>>> >>>> On Tuesday, January 27, 2004, at 02:50 AM, matthieu CONTE wrote: >>>> >>>>> I currently trying to use the Bio ::seqIO ::tigr module. >>>>> My objective is to download the whole rice genome form Tigr ( >>>>> adress below)and to integrate it in my BioSQL DB. >>>>> For this I am trying to convert the tigr format in swiss format >>>>> with the script below >>>>> >>>>> >>>>> use Bio::SeqIO; >>>>> >>>>> my $in = Bio::SeqIO->new(-file >>>>> =>'>>>> =>'tigr'); >>>>> >>>>> my $out = Bio::SeqIO->new(-file => >>>>> '>/home/conte/pipeline_orthologues/data/orysa_swiss.txt' , >>>>> -format=>'swiss'); >>>>> >>>>> print $out $_ while <$in>; >>>>> >>>>> I obtain: >>>>> >>>>> ------------ EXCEPTION ------------- >>>>> MSG: [19]Required missing >>>>> STACK Bio::SeqIO::tigr::throw >>>>> /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/ >>>>> tigr.pm:1338 >>>>> STACK Bio::SeqIO::tigr::_process_header >>>>> /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/>>>>> tigr.pm:700 >>>>> STACK Bio::SeqIO::tigr::_process_assembly >>>>> /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/>>>>> tigr.pm:535 >>>>> STACK Bio::SeqIO::tigr::_process_tigr >>>>> /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/>>>>> tigr.pm:453 >>>>> STACK Bio::SeqIO::tigr::_process >>>>> /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/>>>>> tigr.pm:420 >>>>> STACK Bio::SeqIO::tigr::_initialize >>>>> /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:90 >>>>> STACK Bio::SeqIO::new >>>>> /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:358 >>>>> STACK Bio::SeqIO::new >>>>> /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:378 >>>>> STACK toplevel get_bioseq_tigr.pl:8 >>>>> >>>>> Could you please tell me if there is a problem with the parser or >>>>> with the input data format of Tigr? 
>>>>> >>>>> Thanks in advance >>>>> >>>>> >>>>> >>>>> >>>>> Matthieu CONTE >>>>> m_conte@hotmail.com >>>>> >>>>> _________________________________________________________________ >>>>> MSN Messenger : discutez en direct avec vos amis ! >>>>> http://www.msn.fr/msger/default.asp >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l@portal.open-bio.org >>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>>> >>>> -- >>>> ------------------------------------------------------------- >>>> Hilmar Lapp email: lapp at gnf.org >>>> GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 >>>> ------------------------------------------------------------- >>>> >>>> >>> >>> _________________________________________________________________ >>> MSN Messenger : discutez en direct avec vos amis ! >>> http://www.msn.fr/msger/default.asp >>> >>> >> -- >> ------------------------------------------------------------- >> Hilmar Lapp email: lapp at gnf.org >> GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 >> ------------------------------------------------------------- >> >> > > _________________________________________________________________ > MSN Search, le moteur de recherche qui pense comme vous ! > http://search.msn.fr/worldwide.asp > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From mitchell at odin.mdacc.tmc.edu Mon Feb 2 16:45:04 2004 From: mitchell at odin.mdacc.tmc.edu (James Mitchell) Date: Mon Feb 2 16:51:22 2004 Subject: [Bioperl-l] Bio::Ontology - parsing GO files Message-ID: I'm using Bio::Ontology modules to access GO tree information, ie. ancestors/descendants of a given node. 
I'm using it like this: --- use strict; use Bio::Ontology::SimpleGOEngine; my $gendir = $ENV{GeneLink_Dir}; my $deffile = $gendir . "GO.defs"; my $comfile = $gendir . "component.ontology"; my $funfile = $gendir . "function.ontology"; my $profile = $gendir . "process.ontology"; my $parser = Bio::Ontology::SimpleGOEngine->new ( -defs_file => $deffile, -files => [$comfile, $funfile, $profile] ); my $engine = $parser->parse(); --- I'm getting this error though: Can't locate object method "parse" via package "Bio::Ontology::SimpleGOEngine" ( perhaps you forgot to load "Bio::Ontology::SimpleGOEngine"?) --- Is this the correct method for parsing GO files? I'm using version 1.4 on Windows. thanks, James From hlapp at gmx.net Tue Feb 3 01:32:58 2004 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue Feb 3 01:39:18 2004 Subject: [Bioperl-l] Bio::Ontology - parsing GO files In-Reply-To: Message-ID: On Monday, February 2, 2004, at 01:45 PM, James Mitchell wrote: > my $parser = Bio::Ontology::SimpleGOEngine->new > Is this still in the documentation? If so, I apologize. 
You parse ontologies analogously to other IO APIs in bioperl: $ont_stream = Bio::OntologyIO->new(-format => 'go', -files => [....], -defs_file => $deffile); and then while(my $ont = $ont_stream->next_ontology()) { # do something with $ont (it's a Bio::Ontology::OntologyI) } Most ontology streams will only ever have one ontology. So, for GO you could just as well say my $go_ont = $ont_stream->next_ontology(); Hth, -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From lstein at cshl.edu Tue Feb 3 04:37:34 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Tue Feb 3 04:44:34 2004 Subject: [Bioperl-l] Testing BioPerl objects for equality In-Reply-To: <401E1528.5060309@biomax.de> References: <401E1528.5060309@biomax.de> Message-ID: <200402031137.34626.lstein@cshl.edu> I think that's a great idea. I hadn't known about is_deeply(). There's also a Test::Differences module that does something similar. Lincoln On Monday 02 February 2004 11:15 am, Bernhard Schmalhofer wrote: > Ewan Birney wrote: > > On Thu, 29 Jan 2004, Peter van Heusden wrote: > >>I've got an idea for testing where I'd like to 'round-trip' > >> through SeqIO: read in from a file on disk, write out again with > >> write_seq() and then read in the file written by write_seq() and > >> compare the two sequence objects. If they aren't equal, it means > >> we've got a problem. > > > > That sounds like a great idea... we've always had problems with > > diff'ing the files because of whitespace issues, but diff'ing the > > objects sounds great. > > > >>To make this work requires some kind of equals() method on Seq, > >>SeqFeature, etc. This doesn't seem to be there at the moment - or > >> am I missing something? Maybe there should probably be some kind > >> of Bio::ComparableI interface which provides an equals() > >> abstract method. 
> > If the roundtrip is starting from a file is a specific format, > shouldn't it be possiple to compare the data structures of the > sequence object directly? > I was think of using something like Test::More::is_deeply(), which > tells you where the data structures start to become different. > > CU, Bernhard -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From billthebrute at yahoo.fr Tue Feb 3 08:04:26 2004 From: billthebrute at yahoo.fr (=?iso-8859-1?q?william=20ritchie?=) Date: Tue Feb 3 08:10:39 2004 Subject: [Bioperl-l] Splice site Message-ID: <20040203130426.21627.qmail@web25209.mail.ukl.yahoo.com> Hi, I m looking for an implementation of a good splice site prediction algorithm (like netgene or sitevideo). Does anyone have any suggestions? Thanks. _________________________________________________________________ Do You Yahoo!? -- Une adresse @yahoo.fr gratuite et en fran?ais ! Yahoo! 
Mail : http://fr.mail.yahoo.com From Sebastien.Moretti at igs.cnrs-mrs.fr Tue Feb 3 08:49:02 2004 From: Sebastien.Moretti at igs.cnrs-mrs.fr (Sebastien Moretti) Date: Tue Feb 3 08:52:19 2004 Subject: [Bioperl-l] Uniprot In-Reply-To: <20040203130426.21627.qmail@web25209.mail.ukl.yahoo.com> References: <20040203130426.21627.qmail@web25209.mail.ukl.yahoo.com> Message-ID: <200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr> Hello Is there a BioPerl module to send a request to UniProt db ? My script send a request to Swissprot : #!/usr/bin/perl use strict; use Bio::DB::SwissProt; use Bio::SeqIO; my $acc=$ARGV[0]; my $gb = new Bio::DB::SwissProt; my $stream = $gb->get_Seq_by_acc($acc); my $out=Bio::SeqIO->new(-format=>'swiss'); my $result=$out->write_seq($stream); $result =~ s/^1.*$//; print $result; exit; But how can I do the same with UniProt ? Thanks -- Sebastien MORETTI CNRS - IGS 31 chemin Joseph Aiguier 13402 Marseille cedex 20, FRANCE tel. 04 91 16 44 55 - 06 61 88 59 00 From Wiepert.Mathieu at mayo.edu Tue Feb 3 09:49:49 2004 From: Wiepert.Mathieu at mayo.edu (Wiepert, Mathieu) Date: Tue Feb 3 09:56:35 2004 Subject: [Bioperl-l] Difference between Message-ID: <2F41CC6C9777D311ACBD009027B108EA06E9AFD5@excsrv32.mayo.edu> Hi, When I reported this, I was told that it was actually a minor bug, and they would look into it. It didn't sound like something they were going to address any time soon, and I never followed up, so guess it is still the same issue... -mat > -----Original Message----- > From: Alan Li [mailto:immunoguest@hotmail.com] > Sent: Saturday, January 31, 2004 5:26 PM > To: Wiepert, Mathieu; bioperl-l@bioperl.org > Subject: RE: [Bioperl-l] Difference between > > > I would like to thank everyone for their responses. > > And yes, Mat is right about this being an issue with the XML > output of > stand-alone blast. I tried comparing the results of just the > stand-alone > blast using different -F flags. 
The results below shows that > if "-F F" is > set the results are the same, but are different when using > "-F T" for the > XML output. > > So is there anything I could do to make the XML results the > same when the > filtering option is set to true? Perhaps either through > another blast > parameter or by doing it programmatically? > > -------------------------------------------------------------- > --------- > > blastall -p blastn -m 7 -F T -d ecoli/ecoli.nt -i test.txt > > > 1 > gi|1786181|gb|AE000111.1|AE000111 > Escherichia coli K-12 MG1655 section 1 of > 400 of the > complete genome > AE000111 > 10596 > > > 1 > 589.253 > 297 > 1.04898e-168 > 237 > 560 > 237 > 560 > 1 > 1 > 324 > 324 > 324 > > AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACC > TGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA > GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAA > GCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCAC > CTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGC > CGAACGTATTTTTGCCGAACTTTT > > AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACC > TGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA > GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAA > GCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCAC > CTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGC > CGAACGTATTTTTGCCGAACTTTT > > ||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > ||||||||||||||||||||||||||| > > > -------------------------------------------------------------- > --------- > > blastall -p blastn -m 0 -F T -d ecoli/ecoli.nt -i test.txt > > >gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section > 1 of 400 of the > >complete > genome > Length = 
10596 > > Score = 589 bits (297), Expect = e-168 > Identities = 315/324 (97%) > Strand = Plus / Plus > > > Query: 237 > aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 237 > aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296 > > > Query: 297 > cgggcnnnnnnnnncgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356 > ||||| > |||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 297 > cgggctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356 > > > Query: 357 > cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 357 > cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416 > > > Query: 417 > tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 417 > tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476 > > > Query: 477 > ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 477 > ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536 > > > Query: 537 cgaacgtatttttgccgaactttt 560 > |||||||||||||||||||||||| > Sbjct: 537 cgaacgtatttttgccgaactttt 560 > > -------------------------------------------------------------- > --------- > > blastall -p blastn -m 7 -F F -d ecoli/ecoli.nt -i test.txt > > > 1 > gi|1786181|gb|AE000111.1|AE000111 > Escherichia coli K-12 MG1655 section 1 of > 400 of the > complete genome > AE000111 > 10596 > > > 1 > 1110.61 > 560 > 0 > 1 > 560 > 1 > 560 > 1 > 1 > 560 > 560 > 560 > > AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAA > AGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGA > CTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGA > GTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAG > 
GTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG > CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTAC > ATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGC > AGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATG > ATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTT > TGCCGAACTTTT > > AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAA > AGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGA > CTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGA > GTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAG > GTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG > CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTAC > ATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGC > AGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATG > ATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTT > TGCCGAACTTTT > > ||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > ||||||||||||||| > > > -------------------------------------------------------------- > --------- > > blastall -p blastn -m 0 -F F -d ecoli/ecoli.nt -i test.txt > > >gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section > 1 of 400 of the > >complete > genome > Length = 10596 > > Score = 1110 bits (560), Expect = 0.0 > Identities = 560/560 (100%) > Strand = Plus / Plus > > > Query: 1 > agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60 > > 
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 1 > agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60 > > > Query: 61 > tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 61 > tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120 > > > Query: 121 > tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 121 > tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180 > > > Query: 181 > acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 181 > acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240 > > > Query: 241 > aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 241 > aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300 > > > Query: 301 > ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 301 > ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360 > > > Query: 361 > acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 361 > acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420 > > > Query: 421 > aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 421 > aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480 > > > Query: 481 > gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 481 > gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540 > > > Query: 541 
cgtatttttgccgaactttt 560
>            ||||||||||||||||||||
> Sbjct: 541 cgtatttttgccgaactttt 560
>
>
> >From: "Wiepert, Mathieu"
> >To: 'tai kwan do' , bioperl-l@bioperl.org
> >Subject: RE: [Bioperl-l] Difference between
> >Date: Fri, 30 Jan 2004 11:13:05 -0600
> >
> >Hi,
> >
> >I have a vague recollection of this problem, so this answer is likely
> >wrong, but I think it has something to do with the filtered sequence? You
> >have 9 masked NT's, so it is probably a difference in the defaults, and
> >something to do with the XML output not masked?
> >
> >Sorry I can't find the emails I had with NCBI on this, but I am maybe 70%
> >sure that it is a problem like that, with defaults on the local server
> >versus NCBI, and the XML not using masked data?
> >
> >Someone else chime in if I am way off there...
> >
> >HTH,
> >
> >-mat
> >
>

From jason at cgt.duhs.duke.edu Tue Feb 3 10:45:55 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb 3 10:53:13 2004
Subject: [Bioperl-l] Difference between
In-Reply-To: <2F41CC6C9777D311ACBD009027B108EA06E9AFD5@excsrv32.mayo.edu>
References: <2F41CC6C9777D311ACBD009027B108EA06E9AFD5@excsrv32.mayo.edu>
Message-ID:

One also gets slightly different values at times in -m 8 and -m 0 runs as
well.

-jason

On Tue, 3 Feb 2004, Wiepert, Mathieu wrote:

> Hi,
>
> When I reported this, I was told that it was actually a minor bug, and
> they would look into it. It didn't sound like something they were going
> to address any time soon, and I never followed up, so guess it is still
> the same issue...
> > -mat > > > -----Original Message----- > > From: Alan Li [mailto:immunoguest@hotmail.com] > > Sent: Saturday, January 31, 2004 5:26 PM > > To: Wiepert, Mathieu; bioperl-l@bioperl.org > > Subject: RE: [Bioperl-l] Difference between > > > > > > I would like to thank everyone for their responses. > > > > And yes, Mat is right about this being an issue with the XML > > output of > > stand-alone blast. I tried comparing the results of just the > > stand-alone > > blast using different -F flags. The results below shows that > > if "-F F" is > > set the results are the same, but are different when using > > "-F T" for the > > XML output. > > > > So is there anything I could do to make the XML results the > > same when the > > filtering option is set to true? Perhaps either through > > another blast > > parameter or by doing it programmatically? > > > > -------------------------------------------------------------- > > --------- > > > > blastall -p blastn -m 7 -F T -d ecoli/ecoli.nt -i test.txt > > > > > > 1 > > gi|1786181|gb|AE000111.1|AE000111 > > Escherichia coli K-12 MG1655 section 1 of > > 400 of the > > complete genome > > AE000111 > > 10596 > > > > > > 1 > > 589.253 > > 297 > > 1.04898e-168 > > 237 > > 560 > > 237 > > 560 > > 1 > > 1 > > 324 > > 324 > > 324 > > > > AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACC > > TGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA > > GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAA > > GCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCAC > > CTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGC > > CGAACGTATTTTTGCCGAACTTTT > > > > AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACC > > TGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA > > GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAA > > GCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCAC > > CTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGC > > CGAACGTATTTTTGCCGAACTTTT > > 
> > ||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > ||||||||||||||||||||||||||| > > > > > > -------------------------------------------------------------- > > --------- > > > > blastall -p blastn -m 0 -F T -d ecoli/ecoli.nt -i test.txt > > > > >gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section > > 1 of 400 of the > > >complete > > genome > > Length = 10596 > > > > Score = 589 bits (297), Expect = e-168 > > Identities = 315/324 (97%) > > Strand = Plus / Plus > > > > > > Query: 237 > > aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 237 > > aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296 > > > > > > Query: 297 > > cgggcnnnnnnnnncgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356 > > ||||| > > |||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 297 > > cgggctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356 > > > > > > Query: 357 > > cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 357 > > cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416 > > > > > > Query: 417 > > tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 417 > > tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476 > > > > > > Query: 477 > > ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 477 > > ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536 > > > > > > Query: 537 
cgaacgtatttttgccgaactttt 560 > > |||||||||||||||||||||||| > > Sbjct: 537 cgaacgtatttttgccgaactttt 560 > > > > -------------------------------------------------------------- > > --------- > > > > blastall -p blastn -m 7 -F F -d ecoli/ecoli.nt -i test.txt > > > > > > 1 > > gi|1786181|gb|AE000111.1|AE000111 > > Escherichia coli K-12 MG1655 section 1 of > > 400 of the > > complete genome > > AE000111 > > 10596 > > > > > > 1 > > 1110.61 > > 560 > > 0 > > 1 > > 560 > > 1 > > 560 > > 1 > > 1 > > 560 > > 560 > > 560 > > > > AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAA > > AGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGA > > CTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGA > > GTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAG > > GTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG > > CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTAC > > ATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGC > > AGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATG > > ATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTT > > TGCCGAACTTTT > > > > AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAA > > AGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGA > > CTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGA > > GTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAG > > GTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG > > CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTAC > > ATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGC > > AGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATG > > ATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTT > > TGCCGAACTTTT > > > > ||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > 
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > ||||||||||||||| > > > > > > -------------------------------------------------------------- > > --------- > > > > blastall -p blastn -m 0 -F F -d ecoli/ecoli.nt -i test.txt > > > > >gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section > > 1 of 400 of the > > >complete > > genome > > Length = 10596 > > > > Score = 1110 bits (560), Expect = 0.0 > > Identities = 560/560 (100%) > > Strand = Plus / Plus > > > > > > Query: 1 > > agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 1 > > agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60 > > > > > > Query: 61 > > tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 61 > > tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120 > > > > > > Query: 121 > > tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 121 > > tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180 > > > > > > Query: 181 > > acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 181 > > acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240 > > > > > > Query: 241 > > aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 241 > > 
aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300 > > > > > > Query: 301 > > ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 301 > > ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360 > > > > > > Query: 361 > > acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 361 > > acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420 > > > > > > Query: 421 > > aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 421 > > aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480 > > > > > > Query: 481 > > gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540 > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 481 > > gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540 > > > > > > Query: 541 cgtatttttgccgaactttt 560 > > |||||||||||||||||||| > > Sbjct: 541 cgtatttttgccgaactttt 560 > > > > > > >From: "Wiepert, Mathieu" > > >To: 'tai kwan do' , bioperl-l@bioperl.org > > >Subject: RE: [Bioperl-l] Difference between Date: Fri, 30 > > Jan 2004 11:13:05 > > >-0600 > > > > > >Hi, > > > > > >I have a vague recollection of this problem, so this answer > > is likely > > >wrong, but I think it has something to do with the filtered > > sequence? You > > >have 9 masked NT's, so it is probably a difference in the > > defaults, and > > >something to do with the XML output not masked? > > > > > >Sorry I can't find the emails I had with NCBI on this, but I > > am maybe 70% > > >sure that it is a problem like that, with defaults on the > > local server > > >versus NCBI, and the XML not using masked data? > > > > > >Someone else chime in if I am way off there... 
> > >
> > >HTH,
> > >
> > >-mat
> > >
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From brian_osborne at cognia.com Tue Feb 3 16:18:41 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Tue Feb 3 16:25:46 2004
Subject: [Bioperl-l] Uniprot
In-Reply-To: <200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr>
Message-ID:

Sebastien,

Unfortunately no, there's no way to do a query of UniProt with Bioperl
currently. You're looking for some data that's neither in Swissprot nor in
PIR?

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Sebastien
Moretti
Sent: Tuesday, February 03, 2004 8:49 AM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] Uniprot

Hello

Is there a BioPerl module to send a request to the UniProt db?
My script sends a request to Swissprot:

#!/usr/bin/perl
use strict;
use Bio::DB::SwissProt;
use Bio::SeqIO;

my $acc=$ARGV[0];
my $gb = new Bio::DB::SwissProt;
my $stream = $gb->get_Seq_by_acc($acc);
my $out=Bio::SeqIO->new(-format=>'swiss');
my $result=$out->write_seq($stream);
$result =~ s/^1.*$//;
print $result;
exit;

But how can I do the same with UniProt?
Thanks

--
Sebastien MORETTI
CNRS - IGS
31 chemin Joseph Aiguier
13402 Marseille cedex 20, FRANCE
tel.
04 91 16 44 55 - 06 61 88 59 00

From jason at cgt.duhs.duke.edu Tue Feb 3 16:28:24 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb 3 16:34:39 2004
Subject: [Bioperl-l] Bio::Species / Bio::Taxonomy::Node
In-Reply-To: <9DBC0AB8-5425-11D8-AAD0-000A959EB4C4@gmx.net>
References: <9DBC0AB8-5425-11D8-AAD0-000A959EB4C4@gmx.net>
Message-ID:

We can start making things create Taxonomy::Node objects - I know there is
code floating out there which does

if( $sp->isa('Bio::Species') ) { }

so presumably we could make a Bio::Species interface s.t. Taxonomy::Node
isa Bio::Species...? I don't want to confuse people either.

There may still be a little more functionality that is needed in the
Taxonomy::Node objects and in the db - specifically how to deal with some
of the methods which are really specific to the species level of the
taxonomy (tips), such as the classification/binomial/etc. methods.

-jason

On Sat, 31 Jan 2004, Hilmar Lapp wrote:

> Very cool Jason!!
>
> Now we can start hooking this into bioperl-db.
>
> And what about porting the SeqIO parsers, the target being to be able
> to deprecate Bio::Species altogether? Alternatively, change the
> SeqI/RichSeqI implementations to silently convert a Bio::Species
> instance on set to a Bio::Taxonomy::Node instance?
>
> -hilmar
>
> On Friday, January 30, 2004, at 02:07 PM, Jason Stajich wrote:
>
> > I think I've finally committed code which will allow Bio::Taxonomy::Node
> > to act like Bio::Species while supporting the notion of being a node in a
> > taxonomy hierarchy. Added tests in t/Species.t to this effect.
> >
> > For Bio::DB::Taxonomy::flatfile I've added indexing by parent Id so it is
> > quite fast to grab all the children for a given node. So you can walk up
> > and down the classification system now.
Practically speaking
> > this means to get all the taxon ids of species in the same genus with a
> > few simple lines like below.
> >
> > Unfortunately the NCBI taxonomy API as part of E-Utils doesn't quite
> > provide the information we need so the whole API can't be used without
> > downloading the taxonomy db locally.
> >
> > nodefile and namesfile are the files from ncbi taxdump; see
> > Bio::DB::Taxonomy::flatfile for more info.
> >
> > #!/usr/bin/perl
> > use strict;
> > use warnings;
> >
> > use Bio::DB::Taxonomy;
> > my $db = Bio::DB::Taxonomy->new
> >     (-source   => 'flatfile',
> >      -nodesfile=> '/home/jason/taxonomy/nodes.dmp',
> >      -namesfile=> '/home/jason/taxonomy/names.dmp');
> >
> > my $node = $db->get_Taxonomy_Node(-name => 'Caenorhabditis elegans');
> >
> > my $parent = $node->get_Parent_Node();
> > for my $n ( $parent->get_Children_Nodes() ) {
> >     print $n->binomial, "\t", $n->ncbi_taxid,"\n";
> > }
> >
> > Someday I'll get around to making a HowTO unless someone else wants to do
> > it... =)
> >
> > -jason
> > --
> > Jason Stajich
> > Duke University
> > jason at cgt.mc.duke.edu
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From brian_osborne at cognia.com Tue Feb 3 16:55:25 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Tue Feb 3 17:02:37 2004
Subject: [Bioperl-l] Bio::Species / Bio::Taxonomy::Node
In-Reply-To:
Message-ID:

Jason,

So you'd automatically create the Node object without knowing if the
underlying names and nodes files are present? I agree with you, that could
be confusing. Test for the existence of an environment variable that
specifies the directory that contains these indexed files?

Brian O.
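
A minimal sketch of the kind of check suggested above, assuming a
hypothetical environment variable named BIOPERL_TAXONOMY_DIR (nothing in
BioPerl is known to read a variable of that name) and the nodes.dmp and
names.dmp file names used earlier in this thread. The helper
find_taxonomy_files is likewise made up for illustration; it only tests
that the two dump files exist and does not touch Bio::DB::Taxonomy at all.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical variable name, chosen only for this sketch.
my $ENV_NAME = 'BIOPERL_TAXONOMY_DIR';

# Return (nodes_path, names_path) when both dump files are present in
# $dir, or an empty list (false in scalar context) otherwise.
sub find_taxonomy_files {
    my ($dir) = @_;
    return unless defined $dir && -d $dir;
    my @paths = map { File::Spec->catfile($dir, $_) } qw(nodes.dmp names.dmp);
    for my $path (@paths) {
        return unless -f $path;
    }
    return @paths;
}

my ($nodes, $names) = find_taxonomy_files($ENV{$ENV_NAME});
if (defined $nodes) {
    # Only here would it be safe to build Taxonomy::Node objects.
    print "taxonomy files found: $nodes, $names\n";
} else {
    # No local dump available: stay with plain Bio::Species behaviour.
    print "no local taxonomy dump found; not creating Node objects\n";
}
```

With a check like this the decision is made once, up front, so code that
only ever sees Bio::Species objects keeps working when the variable is
unset or the directory is incomplete.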
-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Jason Stajich
Sent: Tuesday, February 03, 2004 4:28 PM
To: Hilmar Lapp
Cc: Bioperl
Subject: Re: [Bioperl-l] Bio::Species / Bio::Taxonomy::Node

We can start making things create Taxonomy::Node objects - I know there is
code floating out there which does

if( $sp->isa('Bio::Species') ) { }

so presumably we could make a Bio::Species interface s.t. Taxonomy::Node
isa Bio::Species...? I don't want to confuse people either.

There may still be a little more functionality that is needed in the
Taxonomy::Node objects and in the db - specifically how to deal with some
of the methods which are really specific to the species level of the
taxonomy (tips), such as the classification/binomial/etc. methods.

-jason

On Sat, 31 Jan 2004, Hilmar Lapp wrote:

> Very cool Jason!!
>
> Now we can start hooking this into bioperl-db.
>
> And what about porting the SeqIO parsers, the target being to be able
> to deprecate Bio::Species altogether? Alternatively, change the
> SeqI/RichSeqI implementations to silently convert a Bio::Species
> instance on set to a Bio::Taxonomy::Node instance?
>
> -hilmar
>
> On Friday, January 30, 2004, at 02:07 PM, Jason Stajich wrote:
>
> > I think I've finally committed code which will allow Bio::Taxonomy::Node
> > to act like Bio::Species while supporting the notion of being a node in a
> > taxonomy hierarchy. Added tests in t/Species.t to this effect.
> >
> > For Bio::DB::Taxonomy::flatfile I've added indexing by parent Id so it is
> > quite fast to grab all the children for a given node. So you can walk up
> > and down the classification system now. Practically speaking
> > this means to get all the taxon ids of species in the same genus with a
> > few simple lines like below.
> >
> > Unfortunately the NCBI taxonomy API as part of E-Utils doesn't quite
> > provide the information we need so the whole API can't be used without
> > downloading the taxonomy db locally.
> >
> > nodefile and namesfile are the files from ncbi taxdump; see
> > Bio::DB::Taxonomy::flatfile for more info.
> >
> > #!/usr/bin/perl
> > use strict;
> > use warnings;
> >
> > use Bio::DB::Taxonomy;
> > my $db = Bio::DB::Taxonomy->new
> >     (-source   => 'flatfile',
> >      -nodesfile=> '/home/jason/taxonomy/nodes.dmp',
> >      -namesfile=> '/home/jason/taxonomy/names.dmp');
> >
> > my $node = $db->get_Taxonomy_Node(-name => 'Caenorhabditis elegans');
> >
> > my $parent = $node->get_Parent_Node();
> > for my $n ( $parent->get_Children_Nodes() ) {
> >     print $n->binomial, "\t", $n->ncbi_taxid,"\n";
> > }
> >
> > Someday I'll get around to making a HowTO unless someone else wants to do
> > it... =)
> >
> > -jason
> > --
> > Jason Stajich
> > Duke University
> > jason at cgt.mc.duke.edu
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From liam at mmb.usyd.edu.au Tue Feb 3 20:56:02 2004
From: liam at mmb.usyd.edu.au (Liam Elbourne)
Date: Tue Feb 3 21:02:55 2004
Subject: [Bioperl-l] genome analysis
Message-ID:

I'm looking for the quickest way to write a complete genbank entry (i.e.
with all annotation and features) from a microbial genome entry, using the
start and end of the area of interest. In particular I want to 'restart'
the nucleotide positions, so that the beginning becomes position one in my
created genbank entry, and the end becomes the original end minus the
original start, plus one.
I can see how to do this by loading the whole genome into a
Bio::DB::GenBank object and iterating through it etc., but there must be a
better way...

I am new to Bioperl, so if this is the wrong list for this question, a
gentle nudge in the right direction would be appreciated. The answer to my
question above would also be appreciated!

Regards,
Liam Elbourne

From ew9 at york.ac.uk Wed Feb 4 05:11:29 2004
From: ew9 at york.ac.uk (Elizabeth Williams)
Date: Wed Feb 4 05:17:40 2004
Subject: [Bioperl-l] problem with neighbor.pm
Message-ID: <6.0.1.1.0.20040204100745.024d07a0@ew9.imap.york.ac.uk>

Hello,

I am trying to run the phylip modules on a set of Bio::seq sequences. I
have run into a problem with neighbor.pm. The module runs the program but
then loses the tree somehow and comes up with this error message.

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: neighbor did not create tree correctly (expected /tmp/lHCvy7ByeN/treefile)
STACK: Error::throw
STACK: Bio::Root::Root::throw /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::_run /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:412
STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::run /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:353
STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::create_tree /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:370
STACK: geneorigin.pl:74
-----------------------------------------------------------

The script I am using is below.
Anyone have any ideas what is causing the problem? I am at a loss.

use Bio::DB::GenPept;
use Bio::Tools::Run::Alignment::Clustalw;
use Bio::Tools::Run::Phylo::Phylip::ProtDist;
use Bio::Tools::Run::Phylo::Phylip::Neighbor;

#use strict;
use Bio::SeqIO;
use Bio::Seq;
use Bio::AlignIO;
use Bio::SimpleAlign;

$ENV{PHYLIPDIR} = '/biol/programs/phylip/exe';
.
.
.
.
.
.
my @params_align = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params_align);
my $seq_array_ref = \@seq_array; # where @seq_array is an array of Bio::Seq objects created earlier
my $aln = $factory->align($seq_array_ref);
my @params_protdist = ('MODEL' => 'PAM');

my $protdist_factory = Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist);

my $matrix = $protdist_factory->run($aln);

my @params_neighbor = ('type'=>'NJ');

my $neighborfactory = Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor);

my $tree = $neighborfactory->create_tree($matrix);


Elizabeth J.B. Williams

From Eric.Jain at isb-sib.ch Wed Feb 4 05:30:41 2004
From: Eric.Jain at isb-sib.ch (Eric Jain)
Date: Wed Feb 4 05:36:50 2004
Subject: [Bioperl-l] Re: Uniprot
References: <200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr>
Message-ID: <005d01c3eb09$f08d7020$c300000a@caliente>

> Is there a BioPerl module to send a request to UniProt db ?
> My script send a request to Swissprot :

There is:

UniProt.
  Identical to Swiss-Prot and TrEMBL. You should be able to use whatever
  tools you have been using so far to work with these two databases.
  Distributed as two separate files.

UniRef.
  UniProt clusters. Available at three different levels of sequence
  similarity. No BioPerl module available yet, as far as I know.

UniParc.
  Sequence archive. Doesn't really exist yet.

All these three together are also referred to as 'UniProt', in which case
'UniProt UniProt' is called 'UniProt Knowledgebase'. Anybody else who
finds this confusing, raise your hand now...

--
Eric Jain

From jason at cgt.duhs.duke.edu Wed Feb 4 08:23:47 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 4 08:30:09 2004
Subject: [Bioperl-l] problem with neighbor.pm
In-Reply-To: <6.0.1.1.0.20040204100745.024d07a0@ew9.imap.york.ac.uk>
References: <6.0.1.1.0.20040204100745.024d07a0@ew9.imap.york.ac.uk>
Message-ID:

phylip 3.5 or 3.6?
-- you may need to twiddle one setting if you are using
phylip 3.6

-jason

On Wed, 4 Feb 2004, Elizabeth Williams wrote:

> Hello,
>
> I am trying to run the phylip modules on a set of Bio::seq sequences. I
> have run into a problem with neighbor.pm The module runs the program but
> then loses the tree somehow and comes up with this error message.
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: neighbor did not create tree correctly (expected /tmp/lHCvy7ByeN/treefile)
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Root/Root.pm:328
> STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::_run /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:412
> STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::run /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:353
> STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::create_tree /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:370
> STACK: geneorigin.pl:74
> -----------------------------------------------------------
>
> The script I am using is below.
> Anyone have any ideas what is causing the problem? I am at a loss.
>
> use Bio::DB::GenPept;
> use Bio::Tools::Run::Alignment::Clustalw;
> use Bio::Tools::Run::Phylo::Phylip::ProtDist;
> use Bio::Tools::Run::Phylo::Phylip::Neighbor;
>
> #use strict;
> use Bio::SeqIO;
> use Bio::Seq;
> use Bio::AlignIO;
> use Bio::SimpleAlign;
>
> $ENV{PHYLIPDIR} = '/biol/programs/phylip/exe';
> .
> .
> .
> .
> .
> .
> my @params_align = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params_align);
> my $seq_array_ref = \@seq_array; # where @seq_array is an array of Bio::Seq objects created earlier
> my $aln = $factory->align($seq_array_ref);
> my @params_protdist = ('MODEL' => 'PAM');
>
> my $protdist_factory = Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist);
>
> my $matrix = $protdist_factory->run($aln);
>
> my @params_neighbor = ('type'=>'NJ');
>
>
>
> my $neighborfactory = Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor);
>
> my $tree = $neighborfactory->create_tree($matrix);
>
>
> Elizabeth J.B. Williams
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From michael.watson at bbsrc.ac.uk Wed Feb 4 08:26:07 2004
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Wed Feb 4 08:35:20 2004
Subject: [Bioperl-l] Blast Images
Message-ID: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk>

Hi

Does anything exist within Bioperl, or otherwise, to take a Blast output
(or Search object) and produce an image showing the location of the hits
on the query sequence? (much like the NCBI have on their blast pages)

Thanks
Mick

From jason at cgt.duhs.duke.edu Wed Feb 4 08:39:56 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 4 08:46:14 2004
Subject: [Bioperl-l] problem with neighbor.pm
In-Reply-To: <6.0.1.1.0.20040204132900.0252a810@ew9.imap.york.ac.uk>
References: <6.0.1.1.0.20040204100745.024d07a0@ew9.imap.york.ac.uk> <6.0.1.1.0.20040204132900.0252a810@ew9.imap.york.ac.uk>
Message-ID:

either set the env variable PHYLIPVERSION in your shell or at the top of
your script

$ENV{PHYLIPVERSION} = '3.6';

(before any use ... statements)

Or, less ideally, set it per Phylip factory object you create, i.e.:

$protdist_factory->version('3.6');
$neighborfactory->version('3.6');

-jason

On Wed, 4 Feb 2004, Elizabeth Williams wrote:

> i am using 3.6. Which setting needs to be twiddled?
>
> At 13:23 04/02/2004, you wrote:
> >phylip 3.5 or 3.6? -- you may need to twiddle one setting if you are using
> >phylip 3.6
> >
> >
> >-jason
> >On Wed, 4 Feb 2004, Elizabeth Williams wrote:
> >
> > > Hello,
> > >
> > > I am trying to run the phylip modules on a set of Bio::seq sequences. I
> > > have run into a problem with neighbor.pm The module runs the program but
> > > then loses the tree somehow and comes up with this error message.
> > >
> > > ------------- EXCEPTION: Bio::Root::Exception -------------
> > > MSG: neighbor did not create tree correctly (expected /tmp/lHCvy7ByeN/treefile)
> > > STACK: Error::throw
> > > STACK: Bio::Root::Root::throw /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Root/Root.pm:328
> > > STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::_run /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:412
> > > STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::run /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:353
> > > STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::create_tree /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:370
> > > STACK: geneorigin.pl:74
> > > -----------------------------------------------------------
> > >
> > > The script I am using is below.
> > > Anyone have any ideas what is causing the problem? I am at a loss.
> > > > > > use Bio::DB::GenPept; > > > use Bio::Tools::Run::Alignment::Clustalw; > > > use Bio::Tools::Run::Phylo::Phylip::ProtDist; > > > use Bio::Tools::Run::Phylo::Phylip::Neighbor; > > > > > > #use strict; > > > use Bio::SeqIO; > > > use Bio::Seq; > > > use Bio::AlignIO; > > > use Bio::SimpleAlign; > > > > > > $ENV{PHYLIPDIR} = '/biol/programs/phylip/exe'; > > > . > > > . > > > . > > > . > > > . > > > . > > > my @params_align = ('ktuple' => 2, 'matrix' => > > > 'BLOSUM'); > > > my $factory = > > > Bio::Tools::Run::Alignment::Clustalw->new(@params_align); > > > my $seq_array_ref = \@seq_array; # where > > > @seq_array is an array of Bio::Seq objects created earlier > > > my $aln = $factory->align($seq_array_ref); > > > my @params_protdist = ('MODEL' => 'PAM'); > > > > > > my $protdist_factory = > > > Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist); > > > > > > my $matrix = $protdist_factory->run($aln); > > > > > > my @params_neighbor = ('type'=>'NJ'); > > > > > > > > > > > > my $neighborfactory = > > > Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor); > > > > > > my $tree = $neighborfactory->create_tree($matrix); > > > > > > > > > Elizabeth J.B. Williams > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l@portal.open-bio.org > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > > > >-- > >Jason Stajich > >Duke University > >jason at cgt.mc.duke.edu > > Elizabeth J.B. 
Williams > CNAP > Department of Biology > University of York > York > YO10 5YW > mobile: 07813149274 > work: 01904 328757 > Fax: 01904 328762 > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From jason at cgt.duhs.duke.edu Wed Feb 4 08:41:56 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Wed Feb 4 08:48:08 2004 Subject: [Bioperl-l] Blast Images In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk> References: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk> Message-ID: In scripts/graphics/search_overview.PLS You may have to tweak it some to inject that into a custom cgi page. the script should be a starting place not the end all be all. With a little more tweaking you can take advantage of Todd's SVG output instead of PNG if that floats yer boat. -jason On Wed, 4 Feb 2004, michael watson (IAH-C) wrote: > Hi > > Does anything exist within Bioperl, or otherwise, to take a Blast output (or Search object) and produce an image showing the location of the hits on the query sequence? (much like the NCBI have on their blast pages) > > Thanks > Mick > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From brian_osborne at cognia.com Wed Feb 4 08:48:12 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Wed Feb 4 08:54:30 2004 Subject: [Bioperl-l] Blast Images In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk> Message-ID: Mick, Bio::Graphics. Take a look at the Graphics HOWTO (http://bioperl.org/HOWTOs/html/Graphics-HOWTO.html). Brian O. 
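As a rough illustration of the Bio::Graphics route mentioned above, the sketch below follows the general pattern of the Graphics HOWTO: parse a BLAST report with Bio::SearchIO, turn each hit into a feature whose sub-features are its HSPs, and draw the hits under a ruler for the query. The report path taken from the command line, the panel width, and the colours are assumptions made for the example, and Bio::Graphics needs a working GD install.

```perl
#!/usr/bin/perl
# Sketch only: draw BLAST hits along the query sequence as a PNG.
# Assumes the first argument is a standard BLAST text report (hypothetical usage).
use strict;
use warnings;
use Bio::Graphics;
use Bio::SearchIO;
use Bio::SeqFeature::Generic;

my $file = shift or die "Usage: $0 <blast report> > hits.png\n";

my $searchio = Bio::SearchIO->new(-file => $file, -format => 'blast');
my $result   = $searchio->next_result or die "No BLAST result found in $file\n";

# One panel as long as the query sequence.
my $panel = Bio::Graphics::Panel->new(
    -length    => $result->query_length,
    -width     => 800,
    -pad_left  => 10,
    -pad_right => 10,
);

# Ruler track representing the full-length query.
my $full_length = Bio::SeqFeature::Generic->new(
    -start        => 1,
    -end          => $result->query_length,
    -display_name => $result->query_name,
);
$panel->add_track($full_length,
    -glyph   => 'arrow',
    -tick    => 2,
    -fgcolor => 'black',
    -double  => 1,
    -label   => 1,
);

# One track holding all hits; colour intensity follows the hit score.
my $track = $panel->add_track(
    -glyph      => 'graded_segments',
    -label      => 1,
    -connector  => 'dashed',
    -bgcolor    => 'blue',
    -sort_order => 'high_score',
);

while (my $hit = $result->next_hit) {
    my $feature = Bio::SeqFeature::Generic->new(
        -score        => $hit->raw_score,
        -display_name => $hit->name,
    );
    # Each HSP becomes a segment; 'EXPAND' grows the parent to cover it.
    while (my $hsp = $hit->next_hsp) {
        $feature->add_SeqFeature($hsp, 'EXPAND');
    }
    $track->add_feature($feature);
}

binmode STDOUT;
print $panel->png;
```

Run as `perl blast_image.pl report.blast > hits.png` (the script and file names are placeholders). The same panel can be adapted for a CGI page by printing an image header first.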
-----Original Message----- From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of michael watson (IAH-C) Sent: Wednesday, February 04, 2004 8:26 AM To: bioperl-l@bioperl.org Subject: [Bioperl-l] Blast Images Hi Does anything exist within Bioperl, or otherwise, to take a Blast output (or Search object) and produce an image showing the location of the hits on the query sequence? (much like the NCBI have on their blast pages) Thanks Mick _______________________________________________ Bioperl-l mailing list Bioperl-l@portal.open-bio.org http://portal.open-bio.org/mailman/listinfo/bioperl-l From ecky.l at gmx.de Wed Feb 4 15:33:07 2004 From: ecky.l at gmx.de (Eckhard Lehmann) Date: Wed Feb 4 15:39:18 2004 Subject: [Bioperl-l] Blast Images References: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk> Message-ID: <10718.1075926787@www29.gmx.net> Hi, > Does anything exist within Bioperl, or otherwise, to take a Blast output > (or Search object) and produce an image showing the location of the hits on > the query sequence? (much like the NCBI have on their blast pages) Bioperl-Tk does a good job if you want to have it outside any webpage and inside a Perl-Tk widget. The package to consider is Bio::Tk::HitDisplay.
I wrote a blastviewer in Tcl/Tk that does almost the same (and shows a bit more than Bio::Tk::HitDisplay), but that one comes with its own self written BLAST parser which is not so effective as the one in BioPerl and may be a bit buggy... but nevertheless it works and the parser is extensible ;-). Eckhard ;) From Sebastien.Moretti at igs.cnrs-mrs.fr Wed Feb 4 05:11:59 2004 From: Sebastien.Moretti at igs.cnrs-mrs.fr (Sebastien Moretti) Date: Wed Feb 4 22:26:16 2004 Subject: [Bioperl-l] Uniprot In-Reply-To: References: Message-ID: <200402041111.59773.Sebastien.Moretti@igs.cnrs-mrs.fr> Hello I modify the Bio/DB/SwissProt.pm to be able to send a request to UniProt at EBI. I attach the UniProt.pm file. Set it near the SwissProt.pm file. I hope that it hasn't bugs. -- Sebastien MORETTI CNRS - IGS 31 chemin Joseph Aiguier 13402 Marseille cedex 20, FRANCE tel. 04 91 16 44 55 - 06 61 88 59 00 -------------- next part -------------- A non-text attachment was scrubbed... Name: UniProt.pm Type: text/x-perl Size: 11976 bytes Desc: not available Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20040204/5b484e57/UniProt.bin From jason at cgt.duhs.duke.edu Wed Feb 4 22:57:58 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Wed Feb 4 23:04:40 2004 Subject: [Bioperl-l] Uniprot In-Reply-To: <200402041111.59773.Sebastien.Moretti@igs.cnrs-mrs.fr> References: <200402041111.59773.Sebastien.Moretti@igs.cnrs-mrs.fr> Message-ID: Thanks Sebastien -- Since the change is only: 'db' => 'swall' to 'db' => 'uniprot' We might try and fix this directly in SwissProt.pm without having to create a whole new module. The sort of long way to do this is to add this to your script like this: $Bio::DB::SwissProt::HOSTS{'ebi'}->{'basevars'}->{'db'} = 'uniprot'; -jason On Wed, 4 Feb 2004, Sebastien Moretti wrote: > Hello > I modify the Bio/DB/SwissProt.pm to be able to send a request to UniProt at > EBI. > I attach the UniProt.pm file. Set it near the SwissProt.pm file. 
> I hope that it hasn't bugs. > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From heikki at ebi.ac.uk Thu Feb 5 05:34:47 2004 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Thu Feb 5 05:41:03 2004 Subject: [Bioperl-l] Uniprot In-Reply-To: <200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr> References: <20040203130426.21627.qmail@web25209.mail.ukl.yahoo.com> <200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr> Message-ID: <200402051034.48276.heikki@ebi.ac.uk> For the time being the old Swiss-Prot and Uniprot are identical at data level. Uniprot is a political development integrating PIR. >From README: "The UniProt Knowledgebase has been created from Swiss-Prot, TrEMBL and PIR-PSD. It consists of two parts, one containing fully manually annotated records and another one with computationally analysed records awaiting full manual annotation. The two sections will be referred to as the Swiss-Prot Knowledgebase and TrEMBL Protein Database, respectively. PIR-PSD release 48.0 of 28-Oct-2003 has been fully integrated into these sections. This was the last release of PIR-PSD. " -Heikki On Tuesday 03 Feb 2004 13:49, Sebastien Moretti wrote: > Hello > Is there a BioPerl module to send a request to UniProt db ? > My script send a request to Swissprot : > > #!/usr/bin/perl > > use strict; > use Bio::DB::SwissProt; > use Bio::SeqIO; > my $acc=$ARGV[0]; > > my $gb = new Bio::DB::SwissProt; > my $stream = $gb->get_Seq_by_acc($acc); > > my $out=Bio::SeqIO->new(-format=>'swiss'); > > my $result=$out->write_seq($stream); > $result =~ s/^1.*$//; > print $result; > > exit; > > But how can I do the same with UniProt ? > Thanks -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. 
CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________
From billthebrute at yahoo.fr Thu Feb 5 09:54:07 2004 From: billthebrute at yahoo.fr (=?iso-8859-1?q?william=20ritchie?=) Date: Thu Feb 5 10:00:15 2004 Subject: [Bioperl-l] mouse genome Message-ID: <20040205145407.66829.qmail@web25208.mail.ukl.yahoo.com> Hi I would like to blast against the mouse genome on NCBI through a RemoteBlast request but I don t know the code for the "mouse genome database"! Could you help me out ? Cheers! From sdavis2 at mail.nih.gov Thu Feb 5 10:02:49 2004 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu Feb 5 10:16:07 2004 Subject: [Bioperl-l] Parsing blast hits Message-ID: As many posts here, I am new to bioperl. I have a list of several thousand queries (microarray oligos) and the resulting blast hits to mRNAs. I would like to determine which of the hits for each query are to the same "gene"; in other words, I want to find query sequences with mappings to only one gene. I am familiar with blasting and the technicalities of the blast parsers, but I can't think how to tackle the bigger problem. Do I need to query the resulting hits and store the genes that they encode for each hit and just make sure they are the same, or is there something more clever? Any suggestions?
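The bookkeeping described above (store the gene behind each hit, then check that all of a query's hits agree) can be sketched in plain Perl. This is only an illustration under stated assumptions: the hit lists are taken to be already parsed out of the BLAST reports, and the mRNA-to-gene table, the function name single_gene_queries, and every identifier in the toy data are hypothetical.

```perl
#!/usr/bin/perl
# Sketch only: find queries whose hits all map to a single gene.
use strict;
use warnings;

# $hits_for : hash ref, query name => array ref of hit (mRNA) accessions
# $gene_of  : hash ref, mRNA accession => gene identifier (hypothetical lookup)
# Returns a hash ref, query name => gene, for queries mapping to exactly one gene.
sub single_gene_queries {
    my ($hits_for, $gene_of) = @_;
    my %unique;
    for my $query (sort keys %$hits_for) {
        my %genes;
        for my $acc (@{ $hits_for->{$query} }) {
            my $gene = $gene_of->{$acc};
            # Hits with no known gene are skipped in this sketch; a stricter
            # version might reject the whole query instead.
            next unless defined $gene;
            $genes{$gene} = 1;
        }
        my @genes = keys %genes;
        $unique{$query} = $genes[0] if @genes == 1;
    }
    return \%unique;
}

# Toy data, made up for the example.
my %hits_for = (
    oligo_1 => [qw(mRNA_1 mRNA_2)],   # two transcripts of the same gene
    oligo_2 => [qw(mRNA_1 mRNA_3)],   # hits two different genes
    oligo_3 => [qw(mRNA_4)],
);
my %gene_of = (
    mRNA_1 => 'GENE_A',
    mRNA_2 => 'GENE_A',
    mRNA_3 => 'GENE_B',
    mRNA_4 => 'GENE_C',
);

my $unique = single_gene_queries(\%hits_for, \%gene_of);
print "$_\t$unique->{$_}\n" for sort keys %$unique;
```

With the toy data this keeps oligo_1 and oligo_3 and drops oligo_2. Grouping by gene rather than by mRNA is what lets several transcripts of one gene count as a single mapping.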
Thanks, Sean From reveritt at ucalgary.ca Thu Feb 5 15:51:13 2004 From: reveritt at ucalgary.ca (reveritt@ucalgary.ca) Date: Thu Feb 5 15:57:22 2004 Subject: [Bioperl-l] Fwd: Error running makefile Message-ID: <200402052051.i15KpDc23210@mhost2.ucalgary.ca> Forwarded From: reveritt@ucalgary.ca > Hello, > > I am trying to install BioPerl on a Windows NT workstation that is running > Perl 5.6 with all the necessary modules (ie IO::String). When I run the > makefile I get the following error message: > > Please inform the author. > Could not open 'Bio/Root/Version.pm': No such file or directory at (eval 49) > line 6. > > Do you know how to fix this problem or is there documentation I should read? > > Thanks, > Rebecca Everitt > > > > > > > -- > > > -- From clangin at siu.edu Thu Feb 5 21:40:27 2004 From: clangin at siu.edu (Chet Langin) Date: Thu Feb 5 21:23:17 2004 Subject: [Bioperl-l] GD test 10 fails Message-ID: <4022FE9B.5030406@siu.edu> While installing GD, test 10 failed, thus halting the install from CPAN. I installed the latest zlib, libgd, PNG, JPEG, and FreeType libraries, and it still failed. It looked like test 10 might be converting between JPEG and PNG formats. The only strange output during the make was a warning about /usr/local/include being a system directory when it was a non-system directory and that the search order was changed. I went ahead and forced install. But, I was wondering if this might cause me further trouble down the road. ,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ~~~Diagonally parked in a parallel universe ~~~~~ From jason at cgt.duhs.duke.edu Thu Feb 5 21:29:54 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Thu Feb 5 21:36:12 2004 Subject: [Bioperl-l] GD test 10 fails In-Reply-To: <4022FE9B.5030406@siu.edu> References: <4022FE9B.5030406@siu.edu> Message-ID: Only if you want to use Bio::Graphics Are you sure the libgd version on your system matches the requirements of the version of GD.pm you are installing. 
-jason On Thu, 5 Feb 2004, Chet Langin wrote: > > While installing GD, test 10 failed, thus halting > the install from CPAN. > > I installed the latest zlib, libgd, PNG, JPEG, > and FreeType libraries, and it still failed. > > It looked like test 10 might be converting between > JPEG and PNG formats. > > The only strange output during the make was a warning > about /usr/local/include being a system directory > when it was a non-system directory and that the > search order was changed. > > I went ahead and forced install. But, I was > wondering if this might cause me further trouble > down the road. > > ,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, > > ~~~Diagonally parked in a parallel universe ~~~~~ > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From tdhoufek at unity.ncsu.edu Thu Feb 5 21:49:52 2004 From: tdhoufek at unity.ncsu.edu (T.D. Houfek) Date: Thu Feb 5 21:56:37 2004 Subject: [Bioperl-l] oligos, mRNAs, and genes In-Reply-To: <200402051619.i15GJEHH004118@portal.open-bio.org> References: <200402051619.i15GJEHH004118@portal.open-bio.org> Message-ID: <1076035792.10041.72.camel@aether> There are a lot of people around here a lot more qualified to answer, and I hope someone will correct me if I misinform you (or if I've misunderstood your question). If you're dealing with a eukaryote, I think the method you are hinting at, effectively tallying which mRNAs were uniquely matched by your oligos, could run into problems dealing with alternatively-spliced genes, where there's not a 1:1 relationship between gene and mRNA product. But I'm not sure what the incidence of such genes is typically, I think it is just a few percent of genes. 
This shouldn't prevent you from finding "query sequences with mappings to only one gene", and it certainly won't keep you from sampling alternatively spliced products, but there might be a few cases where one gene has more than one query oligo that matches it (if multiple matched mRNA transcripts are subsequently related to the activity of one gene). If your mRNA's correlated genes are already well characterized in one of the major databases / formats, you should be able to use BioPerl to explore the relations between genes and transcripts, but is that your situation, or are these transcripts of yours somewhat less well contextually situated? TD -- :.-----.----------.----------.-----.: T.D. Houfek tdhoufek-AT-unity-DOT-ncsu-DOT-edu Tobacco Genome Initiative NCSU, Raleigh, NC 27606 :.-----.----------.----------.-----.: From tdhoufek at unity.ncsu.edu Thu Feb 5 22:12:47 2004 From: tdhoufek at unity.ncsu.edu (T.D. Houfek) Date: Thu Feb 5 22:19:35 2004 Subject: [Bioperl-l] Re: CVS hang problem In-Reply-To: <1075989943.1474.7.camel@localhost.localdomain> References: <200402030429.i134TWm2000680@uni03mr.unity.ncsu.edu> <1075941658.1362.18.camel@aether> <1075989943.1474.7.camel@localhost.localdomain> Message-ID: <1076037156.14479.0.camel@aether> For the (searchable) record, Scott was right. I tried the CVS checkout again today and got past the point where I was always stalling out before, so it was just a matter of e-weather. Thanks! TD On Thu, 2004-02-05 at 09:05, Scott Cain wrote: > TD, > > There is one moderately big file in the dat directory (7M), so you may > be running into bandwidth issues, either on your end or at SourceForge. > The anonymous cvs server there is notoriously overworked and it can be > difficult to checkout large repositories. > > Scott -- :.-----.----------.----------.-----.: T.D. 
Houfek tdhoufek-AT-unity-DOT-ncsu-DOT-edu Tobacco Genome Initiative NCSU, Raleigh, NC 27606 :.-----.----------.----------.-----.: From hcle028 at cse.unsw.edu.au Thu Feb 5 22:53:43 2004 From: hcle028 at cse.unsw.edu.au (Hong Ching Lee) Date: Thu Feb 5 22:59:58 2004 Subject: [Bioperl-l] Problems running blast Message-ID: Hey everyone, I have a question about whether i can run remote blast using just a string or whether i have to make it into a fasta format file. Can anyone help me with this? Thank You, Hong From barry.moore at genetics.utah.edu Fri Feb 6 01:02:52 2004 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Fri Feb 6 01:09:05 2004 Subject: [Bioperl-l] Problems running blast In-Reply-To: References: Message-ID: <40232E0C.50805@genetics.utah.edu> Hong- You don't have to make your sequence into a fasta file. Have a look at the documentation for the submit_blast method of the Bio::Tools::Run::RemoteBlast module where it tells you that the input can be a sequence object, a reference to an array of sequence objects, or the filename of a fasta file. If your script already has your sequence as any of the Bioperl sequence objects, then you are ready to go. If your script has your sequence as a simple string, it is quite easy to convert that to a PrimarySeq object which you can then submit to BLAST. The following script (adapted from the module documentation) suggests one way of converting a string to a PrimarySeq object and submitting it to BLAST. See the example code in the Synopsis section of the RemoteBlast module documentation mentioned above for examples of how to submit a sequence object, or a fasta file to BLAST.
Barry
----------------------------------------------------------------------------------------------
#!/usr/bin/perl

use strict;
use warnings;
use Bio::PrimarySeq;
use Bio::Tools::Run::RemoteBlast;

#Your sequence as a string
my $sequence_string = "atggagagcagaggcccactggctacctcgcgcctgctgctgttgctgctgttgctacta";

#Initialize string as new sequence
my $seq = new Bio::PrimarySeq(-seq        => $sequence_string,
                              -display_id => "Your_favorite_gene");

#Build the BLAST factory
my $BLAST_factory = Bio::Tools::Run::RemoteBlast->new('-prog'       => 'blastn',
                                                      '-data'       => 'nr',
                                                      '-expect'     => .001,
                                                      '-readmethod' => 'SearchIO' );

#Submit the sequence object to NCBI's BLAST server
my $job = $BLAST_factory->submit_blast($seq);
print STDERR "Blasting sequence ";

#Load the RIDs returned for the BLAST job submitted (in this case only one)
while ( my @rids = $BLAST_factory->each_rid ) {
    #Iterate over RIDs
    foreach my $rid ( @rids ) {
        #Hit the server for a result on RID
        my $blast_results = $BLAST_factory->retrieve_blast($rid);
        #Was a result returned?
        if( !ref($blast_results) ) {
            #If so and it returned an error remove that RID from the stack
            if ($blast_results < 0) {
                $BLAST_factory->remove_rid($rid);
            }
            print STDERR "."; #Keep staring at the dots
            sleep 5;          #Plays nice with the servers
        }
        #If a result was returned and it isn't an error, then pass it to a variable...
        else {
            my $result = $blast_results->next_result();
            $BLAST_factory->remove_rid($rid); #...and remove it's RID from the stack.
            #Check the result for a hit...
            my $hit = $result->next_hit;
            if (ref($hit)) {
                my $hsp = $hit->next_hsp;
                #...collect some values from the result, hit and hsp objects and do
                #something with them.
                my $q_name = $result->query_name();
                my $h_name = $hit->name;
                my $evalue = $hsp->evalue();
                print "\nQuery name: $q_name\nHit name: $h_name\nLowest e-value: $evalue\n";
            }
        }
    }
}
Hong Ching Lee wrote: >Hey everyone, > >I have a question about whether i can run remote blast using just a string >or whether i have to make it into a fasta format file. Can anyone help me >with this? > >Thank You, >Hong >_______________________________________________ >Bioperl-l mailing list >Bioperl-l@portal.open-bio.org >http://portal.open-bio.org/mailman/listinfo/bioperl-l > From clangin at siu.edu Fri Feb 6 04:30:09 2004 From: clangin at siu.edu (Chet Langin) Date: Fri Feb 6 04:12:55 2004 Subject: [Bioperl-l] GD test 10 fails References: <4022FE9B.5030406@siu.edu> Message-ID: <40235EA1.4040008@siu.edu> Greetings, I installed gd-2.0.22 and GD-2.11. --Chet Jason Stajich wrote: > Only if you want to use Bio::Graphics > > Are you sure the libgd version on your system matches the > requirements of the version of GD.pm you are installing.
> > -jason > On Thu, 5 Feb 2004, Chet Langin wrote: > > >>While installing GD, test 10 failed, thus halting >>the install from CPAN. >> >>I installed the latest zlib, libgd, PNG, JPEG, >>and FreeType libraries, and it still failed. >> >>It looked like test 10 might be converting between >>JPEG and PNG formats. >> >>The only strange output during the make was a warning >>about /usr/local/include being a system directory >>when it was a non-system directory and that the >>search order was changed. >> >>I went ahead and forced install. But, I was >>wondering if this might cause me further trouble >>down the road. >> >>,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, >> >>~~~Diagonally parked in a parallel universe ~~~~~ >> >> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l@portal.open-bio.org >>http://portal.open-bio.org/mailman/listinfo/bioperl-l >> > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > -- ,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ~~~Diagonally parked in a parallel universe ~~~~~ From lstein at cshl.edu Fri Feb 6 04:13:47 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Fri Feb 6 04:23:01 2004 Subject: [Bioperl-l] GD test 10 fails In-Reply-To: <4022FE9B.5030406@siu.edu> References: <4022FE9B.5030406@siu.edu> Message-ID: <200402061113.47851.lstein@cshl.edu> Hi Chet, I wrote and maintain the GD library. If you can send me the information on what operating system you're using and the versions of each of the libraries you have installed I might be able to help. Lincoln On Friday 06 February 2004 04:40 am, Chet Langin wrote: > While installing GD, test 10 failed, thus halting > the install from CPAN. > > I installed the latest zlib, libgd, PNG, JPEG, > and FreeType libraries, and it still failed. > > It looked like test 10 might be converting between > JPEG and PNG formats. 
> > The only strange output during the make was a warning > about /usr/local/include being a system directory > when it was a non-system directory and that the > search order was changed. > > I went ahead and forced install. But, I was > wondering if this might cause me further trouble > down the road. > > ,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, > > ~~~Diagonally parked in a parallel universe ~~~~~ > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From khoueiry at ibsm.cnrs-mrs.fr Fri Feb 6 04:32:39 2004 From: khoueiry at ibsm.cnrs-mrs.fr (KHOUEIRY pierre) Date: Fri Feb 6 04:38:54 2004 Subject: [Bioperl-l] fetching a fasta file Message-ID: <40235F37.3020701@ibsm.cnrs-mrs.fr> Hello, I have to search for sequences from a local fasta file. my sequences Id's are in a table my @ID = ('AAS_ECOLI','ABGT_ECOLI','ABRB_ECOLI','ACFD_ECOLI','ACRA_ECOLI','ACRB_ECOLI'). I tried to index my file but it doesn't work. I used something like: $index = Bio::Index::Fasta->new("$file", 'WRITE'); $index->make_index($file); Sometimes I'm getting message Can't open 'DB_File' dbm file '/home/pierre/Perl/col2.fasta' : File exists I want to fetch col2.fasta for all IDs in my table (@ID) above In the doc of Bio::Index::Fasta, they index files and not their contents or am'I wrong. I want to do this approch because i want to search theses ID's in a big nb of fasta files... 
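A minimal sketch of the lookup described above, assuming the index is written to its own file (the name proteins.idx is made up) rather than to the FASTA file itself, and that each FASTA header starts with the ID being searched for (for example ">AAS_ECOLI ..."). Bio::Index::Fasta indexes the records inside the files it is given, so one index can cover many FASTA files. The helper names build_index and fetch_ids are hypothetical.

```perl
#!/usr/bin/perl
# Sketch only: index one or more FASTA files, then fetch records by ID.
use strict;
use warnings;
use Bio::Index::Fasta;
use Bio::SeqIO;

# Build (or extend) an index stored in $index_file covering @fasta_files.
# $index_file must not be the name of one of the FASTA files.
sub build_index {
    my ($index_file, @fasta_files) = @_;
    my $index = Bio::Index::Fasta->new(
        -filename   => $index_file,
        -write_flag => 1,
    );
    $index->make_index(@fasta_files);
    return $index;
}

# Return the sequence objects found for @ids, warning about missing ones.
sub fetch_ids {
    my ($index, @ids) = @_;
    my @found;
    for my $id (@ids) {
        my $seq = eval { $index->fetch($id) };
        if ($seq) {
            push @found, $seq;
        }
        else {
            warn "No record for $id in the index\n";
        }
    }
    return @found;
}

# Example run: perl fetch_by_id.pl col2.fasta other.fasta > wanted.fasta
if (@ARGV) {
    my @ID = ('AAS_ECOLI', 'ABGT_ECOLI', 'ABRB_ECOLI',
              'ACFD_ECOLI', 'ACRA_ECOLI', 'ACRB_ECOLI');
    my $index = build_index('proteins.idx', @ARGV);   # hypothetical index name
    my $out   = Bio::SeqIO->new(-format => 'fasta', -fh => \*STDOUT);
    $out->write_seq($_) for fetch_ids($index, @ID);
}
```

If the headers carry the ID in a different position (for example after a "|" separator), the default ID parsing will not find it and a custom parser would have to be set on the index object.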
-- --------------------------------- Pierre Khoueiry khoueiry@ibsm.cnrs-mrs.fr LCB - CNRS 31, Chemin Joseph Aiguier, 13402 Marseille CEDEX 20, France --------------------------------- From Richard.Adams at ed.ac.uk Fri Feb 6 04:48:58 2004 From: Richard.Adams at ed.ac.uk (Richard Adams) Date: Fri Feb 6 04:55:04 2004 Subject: [Bioperl-l] protein networks Message-ID: <4023630A.2030406@ed.ac.uk> Hi, I was wondering if anyone has written or is writing any modules to deal with protein interaction networks? E.g., to read in from a DIP flatfile or XML, or other such interaction information source and to have methods such as get_number_of_interactors() get_interactor_ids() clustering_coefficient() path_length(from, to) degree() mean_path_length(). etc. Richard -- Dr Richard Adams, Psychiatric Genetics Group, Medical Genetics, Molecular Medicine Centre, Western General Hospital, Crewe Rd West, Edinburgh UK EH4 2XU Tel: 44 131 651 1084 richard.adams@ed.ac.uk From Marc.Logghe at devgen.com Fri Feb 6 04:52:20 2004 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Fri Feb 6 04:58:42 2004 Subject: [Bioperl-l] fetching a fasta file Message-ID: > -----Original Message----- > From: KHOUEIRY pierre [mailto:khoueiry@ibsm.cnrs-mrs.fr] > Sent: Friday, February 06, 2004 10:33 AM > To: bioperl-l@bioperl.org > Subject: [Bioperl-l] fetching a fasta file > > > Hello, > I have to search for sequences from a local fasta file. my sequences > Id's are in a table > my @ID = > ('AAS_ECOLI','ABGT_ECOLI','ABRB_ECOLI','ACFD_ECOLI','ACRA_ECOL > I','ACRB_ECOLI'). > > I tried to index my file but it doesn't work. > I used something like: > $index = Bio::Index::Fasta->new("$file", 'WRITE'); > $index->make_index($file); > > Sometimes I'm getting message > Can't open 'DB_File' dbm file '/home/pierre/Perl/col2.fasta' > : File exists Hi, the problem is that you use the same name for your index file as for your fasta file. 
this should do it: $index = Bio::Index::Fasta->new("${file}.idx", 'WRITE'); $index->make_index($file); HTH, Marc From michael.watson at bbsrc.ac.uk Fri Feb 6 05:32:05 2004 From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C)) Date: Fri Feb 6 05:39:38 2004 Subject: [Bioperl-l] Sub Seq Feature help Message-ID: <20B7EB075F2D4542AFFAF813E98ACD930282263D@cl-exsrv1.irad.bbsrc.ac.uk> Hello I want to manipulate the start and end position of a CDS feature that looks like this: FT CDS join(2307..3221,1..1623) I have tried: my @features = $seq->get_all_SeqFeatures; foreach $f (@features) { my @subs = $f->sub_SeqFeature; foreach $s (@subs) { print $s->start, "-", $s->end, "\n"; } } However, I get nothing out. The code doesn't descend into the sub seq features as $f->sub_SeqFeature doesn't return anything. Nor does $f->get_SeqFeatures. Clearly I am doing something wrong, but what? I am using Bioperl-1.2.3 Thanks Mick From james.wasmuth at ed.ac.uk Fri Feb 6 05:36:42 2004 From: james.wasmuth at ed.ac.uk (James Wasmuth) Date: Fri Feb 6 05:47:11 2004 Subject: [Bioperl-l] Problems running blast In-Reply-To: References: Message-ID: <40236E3A.8070607@ed.ac.uk> input can be: * sequence object * array ref of sequence objects * filename of file containing fasta formatted sequences best bet is create a Seq object. From the HOWTO: use IO::String; use Bio::SeqIO; # get a string into $string somehow, with its format in # $format, say from a web form my $stringfh = new IO::String($string); my $seqio = new Bio::SeqIO(-fh => $stringfh, -format => $format); while( my $seq = $seqio->next_seq ) { # process each seq } hth james Hong Ching Lee wrote: >Hey everyone, > >I have a question about whether i can run remote blast using just a string >or whether i have to make it into a fasta format file. Can anyone help me >with this? 
>
>Thank You,
>Hong
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l@portal.open-bio.org
>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>

-- 
Nematode Bioinformatics         ||
Blaxter Nematode Genomics Group ||
School of Biological Sciences   ||
Ashworth Laboratories           ||
King's Buildings                || tel: +44 131 650 7403
University of Edinburgh         || web: www.nematodes.org
Edinburgh                       ||
EH9 3JT                         ||
UK                              ||

"I have not failed. I've just found 10,000 ways that don't work."
--- Thomas Edison

From Marc.Logghe at devgen.com Fri Feb 6 05:50:33 2004
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Fri Feb 6 05:56:44 2004
Subject: [Bioperl-l] Sub Seq Feature help
Message-ID:

> -----Original Message-----
> From: michael watson (IAH-C) [mailto:michael.watson@bbsrc.ac.uk]
> Sent: Friday, February 06, 2004 11:32 AM
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] Sub Seq Feature help
>
>
> Hello
>
> I want to manipulate the start and end position of a CDS
> feature that looks like this:
>
> FT   CDS             join(2307..3221,1..1623)
>
> I have tried:
>
> my @features = $seq->get_all_SeqFeatures;
> foreach $f (@features) {
>   my @subs = $f->sub_SeqFeature;
>   foreach $s (@subs) {
>     print $s->start, "-", $s->end, "\n";
>   }
> }
>

Actually you are not dealing with sub_features here, just a plain
feature. What you are really looking for is sub_locations.
When you invoke the method
my $location = $f->location;
you will get a Location object. In the specific case you showed, you
will get a Bio::Location::Split object. There you will find the
appropriate methods to achieve what you want (e.g.
each_Location(), sub_Location) HTH, Marc From brian_osborne at cognia.com Fri Feb 6 07:38:01 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Fri Feb 6 07:44:11 2004 Subject: [Bioperl-l] Sub Seq Feature help In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD930282263D@cl-exsrv1.irad.bbsrc.ac.uk> Message-ID: Michael, There may be useful example code for you in the Feature-Annotation HOWTO (http://bioperl.org/HOWTOs/html/Feature-Annotation.html) or in the FAQ (http://bioperl.org/Core/Latest/faq.html#Q5.3). Brian O. -----Original Message----- From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of michael watson (IAH-C) Sent: Friday, February 06, 2004 5:32 AM To: bioperl-l@bioperl.org Subject: [Bioperl-l] Sub Seq Feature help Hello I want to manipulate the start and end position of a CDS feature that looks like this: FT CDS join(2307..3221,1..1623) I have tried: my @features = $seq->get_all_SeqFeatures; foreach $f (@features) { my @subs = $f->sub_SeqFeature; foreach $s (@subs) { print $s->start, "-", $s->end, "\n"; } } However, I get nothing out. The code doesn't descend into the sub seq features as $f->sub_SeqFeature doesn't return anything. Nor does $f->get_SeqFeatures. Clearly I am doing something wrong, but what? I am using Bioperl-1.2.3 Thanks Mick _______________________________________________ Bioperl-l mailing list Bioperl-l@portal.open-bio.org http://portal.open-bio.org/mailman/listinfo/bioperl-l From lstein at cshl.edu Fri Feb 6 09:33:13 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Fri Feb 6 09:40:02 2004 Subject: [Bioperl-l] GD test 10 fails In-Reply-To: <40236791.4040107@siu.edu> References: <4022FE9B.5030406@siu.edu> <200402061113.47851.lstein@cshl.edu> <40236791.4040107@siu.edu> Message-ID: <200402061633.13772.lstein@cshl.edu> Hi, OK, the issue is that test 10 was broken when Tom Boutell released gd-2.0.22. Get GD version 2.12 and all should be well. It'll be appearing on CPAN in a day or so. 
Lincoln On Friday 06 February 2004 12:08 pm, Chet Langin wrote: > Greetings, > > SuSE 8.1 > gd-2.0.22 > GD-2.0.22 > freetype-2.1.5 > jpeg-6b > libpng-1.2.5 > zlib-1.2.1 > > I did not install a new XPM because I wasn't sure about the imake > system and the latest version was 1998, which should have come with > my distro. > > I started with a new server and installed SuSE. I did the online > SuSE updates. I got CPAN going and did the "r" updates. I > installed some lower level modules from CPAN. I got Apache and > MySQL running. I updated the MySQL modules from CPAN.I looked for > earlier version files of libgd on my machine, but did not have any. > I installed zlib. I installed libgd. It said that I already had > freetype, jpeg, and png. I started to install GD. It failed. I > went on the Internet and reinstalled freetype, jpeg and png. I > reinstalled libgd. I tried to install GD and it failed on test 10. > > =================================================================== >=== Running Mkbootstrap for GD () > chmod 644 GD.bs > rm -f blib/arch/auto/GD/GD.so > LD_RUN_PATH="/usr/local/lib:/usr/lib:/usr/X11R6/lib" cc -shared > GD.o -o blib/arch/auto/GD/GD.so -L/usr/local/lib -L/usr/lib/X11 > -L/usr/X11R6/lib -L/usr/X11/lib -L/usr/local/lib -lgd -lpng -lz > -lfreetype -ljpeg -lm -lX11 -lXpm > chmod 755 blib/arch/auto/GD/GD.so > cp GD.bs blib/arch/auto/GD/GD.bs > chmod 644 blib/arch/auto/GD/GD.bs > PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" > "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t > t/GD..........FAILED test 10 > Failed 1/10 tests, 90.00% okay (less 1 skipped test: 8 okay, > 80.00%) t/Polyline....ok > Failed Test Stat Wstat Total Fail Failed List of Failed > ------------------------------------------------------------------- >------------ t/GD.t 10 1 10.00% 10 > 1 subtest skipped. > Failed 1/2 test scripts, 50.00% okay. 1/11 subtests failed, 90.91% > okay. 
make: *** [test_dynamic] Error 29 > =================================================================== >==== > > --Chet > > Lincoln Stein wrote: > > Hi Chet, > > > > I wrote and maintain the GD library. If you can send me the > > information on what operating system you're using and the > > versions of each of the libraries you have installed I might be > > able to help. > > > > Lincoln > > > > On Friday 06 February 2004 04:40 am, Chet Langin wrote: > >>While installing GD, test 10 failed, thus halting > >>the install from CPAN. > >> > >>I installed the latest zlib, libgd, PNG, JPEG, > >>and FreeType libraries, and it still failed. > >> > >>It looked like test 10 might be converting between > >>JPEG and PNG formats. > >> > >>The only strange output during the make was a warning > >>about /usr/local/include being a system directory > >>when it was a non-system directory and that the > >>search order was changed. > >> > >>I went ahead and forced install. But, I was > >>wondering if this might cause me further trouble > >>down the road. > >> > >>,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, > >> > >>~~~Diagonally parked in a parallel universe ~~~~~ > >> > >> > >>_______________________________________________ > >>Bioperl-l mailing list > >>Bioperl-l@portal.open-bio.org > >>http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From clangin at siu.edu Fri Feb 6 10:27:33 2004 From: clangin at siu.edu (Chet Langin) Date: Fri Feb 6 10:10:16 2004 Subject: [Bioperl-l] GD test 10 fails References: <4022FE9B.5030406@siu.edu> <200402061113.47851.lstein@cshl.edu> <40236791.4040107@siu.edu> <200402061633.13772.lstein@cshl.edu> Message-ID: <4023B265.8010707@siu.edu> Greetings, It was GD-2.11. Sorry for the typo. Thanks for checking on it! I'll keep an eye out for GD 2.12. 
--Chet Lincoln Stein wrote: > Hi, > > OK, the issue is that test 10 was broken when Tom Boutell released > gd-2.0.22. Get GD version 2.12 and all should be well. It'll be > appearing on CPAN in a day or so. > > Lincoln > > On Friday 06 February 2004 12:08 pm, Chet Langin wrote: > >>Greetings, >> >>SuSE 8.1 >>gd-2.0.22 >>GD-2.0.22 >>freetype-2.1.5 >>jpeg-6b >>libpng-1.2.5 >>zlib-1.2.1 >> >>I did not install a new XPM because I wasn't sure about the imake >>system and the latest version was 1998, which should have come with >>my distro. >> >>I started with a new server and installed SuSE. I did the online >>SuSE updates. I got CPAN going and did the "r" updates. I >>installed some lower level modules from CPAN. I got Apache and >>MySQL running. I updated the MySQL modules from CPAN.I looked for >>earlier version files of libgd on my machine, but did not have any. >> I installed zlib. I installed libgd. It said that I already had >>freetype, jpeg, and png. I started to install GD. It failed. I >>went on the Internet and reinstalled freetype, jpeg and png. I >>reinstalled libgd. I tried to install GD and it failed on test 10. 
>> >>=================================================================== >>=== Running Mkbootstrap for GD () >>chmod 644 GD.bs >>rm -f blib/arch/auto/GD/GD.so >>LD_RUN_PATH="/usr/local/lib:/usr/lib:/usr/X11R6/lib" cc -shared >>GD.o -o blib/arch/auto/GD/GD.so -L/usr/local/lib -L/usr/lib/X11 >>-L/usr/X11R6/lib -L/usr/X11/lib -L/usr/local/lib -lgd -lpng -lz >>-lfreetype -ljpeg -lm -lX11 -lXpm >>chmod 755 blib/arch/auto/GD/GD.so >>cp GD.bs blib/arch/auto/GD/GD.bs >>chmod 644 blib/arch/auto/GD/GD.bs >>PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" >>"test_harness(0, 'blib/lib', 'blib/arch')" t/*.t >>t/GD..........FAILED test 10 >> Failed 1/10 tests, 90.00% okay (less 1 skipped test: 8 okay, >>80.00%) t/Polyline....ok >>Failed Test Stat Wstat Total Fail Failed List of Failed >>------------------------------------------------------------------- >>------------ t/GD.t 10 1 10.00% 10 >>1 subtest skipped. >>Failed 1/2 test scripts, 50.00% okay. 1/11 subtests failed, 90.91% >>okay. make: *** [test_dynamic] Error 29 >>=================================================================== >>==== >> >>--Chet >> >>Lincoln Stein wrote: >> >>>Hi Chet, >>> >>>I wrote and maintain the GD library. If you can send me the >>>information on what operating system you're using and the >>>versions of each of the libraries you have installed I might be >>>able to help. >>> >>>Lincoln >>> >>>On Friday 06 February 2004 04:40 am, Chet Langin wrote: >>> >>>>While installing GD, test 10 failed, thus halting >>>>the install from CPAN. >>>> >>>>I installed the latest zlib, libgd, PNG, JPEG, >>>>and FreeType libraries, and it still failed. >>>> >>>>It looked like test 10 might be converting between >>>>JPEG and PNG formats. >>>> >>>>The only strange output during the make was a warning >>>>about /usr/local/include being a system directory >>>>when it was a non-system directory and that the >>>>search order was changed. >>>> >>>>I went ahead and forced install. 
But, I was >>>>wondering if this might cause me further trouble >>>>down the road. >>>> >>>>,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, >>>> >>>>~~~Diagonally parked in a parallel universe ~~~~~ >>>> >>>> >>>>_______________________________________________ >>>>Bioperl-l mailing list >>>>Bioperl-l@portal.open-bio.org >>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l >>> > -- ,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ~~~Diagonally parked in a parallel universe ~~~~~ From tewang at ea.nacs.uci.edu Fri Feb 6 11:29:18 2004 From: tewang at ea.nacs.uci.edu (Eric Wang) Date: Fri Feb 6 11:35:24 2004 Subject: [Bioperl-l] Computing Allele Frequency From Heterozygosity In-Reply-To: <200402061510.i16FAgHH018676@portal.open-bio.org> Message-ID: Dear All, I am wondering if bioperl has implemented a way of converting Heterozygosity number (found in dbSNP) to true allele frequency. If not, I'd like to make some contributions. =) Many thanks! Eric From allenday at ucla.edu Fri Feb 6 11:44:20 2004 From: allenday at ucla.edu (Allen Day) Date: Fri Feb 6 11:50:26 2004 Subject: [Bioperl-l] protein networks In-Reply-To: <4023630A.2030406@ed.ac.uk> Message-ID: Richard, To my knowledge, nothing exists in bioperl. If you were to implement something, http://psidev.sf.net would be a good place to start. The Proteomics Standards Initiative, of which DIP is a part, is working to develop standard formats for proteomics data. -Allen On Fri, 6 Feb 2004, Richard Adams wrote: > Hi, > I was wondering if anyone has written or is writing any modules to deal > with protein interaction networks? > > E.g., to read in from a DIP flatfile or XML, or other such interaction > information source > and to have methods such as > > get_number_of_interactors() > get_interactor_ids() > clustering_coefficient() > path_length(from, to) > degree() > mean_path_length(). > > etc. 
> > > Richard > > From jason at cgt.duhs.duke.edu Fri Feb 6 13:09:52 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Fri Feb 6 13:16:05 2004 Subject: [Bioperl-l] Bio::Species / Bio::Taxonomy::Node In-Reply-To: References: Message-ID: hmm - I was thinking that it is possible to create Taxonomy::Node which behaves just like a Bio::Species object if we feed it all the necessary information up front (the classification array essentially). It is only necessary to have a Bio::DB::Taxonomy handle if you want to do more sophisticated things [get all the sibling nodes at this level, etc]. Basically, I would expect Taxonomy::Node to be able to do everything that Bio::Species can do, AND also be db aware. It is just this pre-loaded with data versus a fully DB-dependent object. This differs from the way I built Taxonomy::Node at first, where if you wanted the Kingdom for a species, you had to walk up the hierarchy - now you push that all down at object creation time via the classification array. So for the simple case of Genbank/Swiss/EMBL parsing, we would operate as normal, and create Bio::Species (Bio::Taxonomy::Node really) objects as per normal. Only if someone wanted to do fun Bio::Taxonomy stuff they would need to instantiate a Bio::Taxonomy::Node from a taxonomydb (it needs to get the ncbi_taxid and a dbhandle). -jason On Tue, 3 Feb 2004, Brian Osborne wrote: > Jason, > > So you'd automatically create the Node object without knowing if the > underlying names and nodes files are present? I agree with you, that could > be confusing. > > Test for the existence of an env that specifies the directory that contains > these indexed files? > > Brian O. 
> > -----Original Message----- > From: bioperl-l-bounces@portal.open-bio.org > [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Jason Stajich > Sent: Tuesday, February 03, 2004 4:28 PM > To: Hilmar Lapp > Cc: Bioperl > Subject: Re: [Bioperl-l] Bio::Species / Bio::Taxonomy::Node > > We can start making things create Taxonomy::Node objects - I know there > code floating out there which does > if( $sp->isa('Bio::Species') ) { } > > so presumably we could make Bio::Species interface s.t. taxonomy::Node > isa Bio::Species...? I don't want to confuse people either. > > There may still be a little more functionality that is needed in the > Taxonomy::Node objects and in the db - specifically how to deal with > some of the methods which are really specific to the species level of > the taxonomy (tips) such as classification/bionomial/ etc methods. > > -jason > > On Sat, 31 Jan 2004, Hilmar Lapp wrote: > > > Very cool Jason!! > > > > Now we can start hooking this into bioperl-db. > > > > And what about porting the SeqIO parsers, the target being to be able > > to deprecate Bio::Species altogether? Alternatively, change the > > SeqI/RichSeqI implementations to silently convert a Bio::Species > > instance on set to a Bio::Taxonomy::Node instance? > > > > -hilmar > > > > On Friday, January 30, 2004, at 02:07 PM, Jason Stajich wrote: > > > > > I think I've finally committed code which will allow > > > Bio::Taxonomy::Node > > > to act like Bio::Species while supporting the notion of being a node > > > in a > > > taxonomy hierarchy. Added tests in t/Species.t to this effect. > > > > > > For Bio::DB::Taxonomy::flatfile I've added indexing by parent Id so it > > > is > > > quite fast to grab all the children for a given node. So you can walk > > > up > > > and down the classification system now. Practically speaking > > > this means to get all the taxon ids of species in the same genus with a > > > few simple lines like below. 
> > > > > > Unfortunately the the NCBI taxonomy API as part of E-Utils doesn't > > > quite > > > provide the information we need so the whole API can't be used without > > > downloading the taxonomy db locally. > > > > > > nodefile and namesfile are the files from ncbi taxdump see > > > Bio::DB::Taxonomy::flatfile for more info. > > > > > > #!/usr/bin/perl > > > use strict; > > > use warnings; > > > > > > use Bio::DB::Taxonomy; > > > my $db = Bio::DB::Taxonomy->new > > > (-source => 'flatfile', > > > -nodesfile=> '/home/jason/taxonomy/nodes.dmp', > > > -namesfile=> '/home/jason/taxonomy/names.dmp'); > > > > > > my $node = $db->get_Taxonomy_Node(-name => 'Caenorhabditis elegans'); > > > > > > my $parent = $node->get_Parent_Node(); > > > for my $n ( $parent->get_Children_Nodes() ) { > > > print $n->binomial, "\t", $n->ncbi_taxid,"\n"; > > > } > > > > > > Someday I'll get around to making a HowTO unless someone else wants to > > > do > > > it... =) > > > > > > -jason > > > -- > > > Jason Stajich > > > Duke University > > > jason at cgt.mc.duke.edu > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l@portal.open-bio.org > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From mcipriano at mbl.edu Fri Feb 6 16:43:59 2004 From: mcipriano at mbl.edu (Michael Cipriano) Date: Fri Feb 6 16:50:01 2004 Subject: [Bioperl-l] question on simple align object Message-ID: <000f01c3ecfa$5414e320$8fae8080@Ripley> Is there a way I can get a Simple Alignment object that I created from a clustalw alignment into a string of a specific format? 
I do not want to deal with any files at all, I want to just be able to
print the alignment from a cgi-script on a web page.

So far I have this:

# Create alignment
my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);

my $aln = $factory->align($seq_array_ref);

I tried all sorts of things, but can't get specifically what I need
(which is simply the whole alignment file as a msf formatted string). I
would like to not have to deal with any temporary files, unless there is
a way I can get to the temporary file that is created from the clustalw
alignment and just stick that into a string.

Thanks for any help in advance.

From brian_osborne at cognia.com Fri Feb 6 17:00:08 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Fri Feb 6 17:06:56 2004
Subject: [Bioperl-l] question on simple align object
In-Reply-To: <000f01c3ecfa$5414e320$8fae8080@Ripley>
Message-ID:

Michael,

I just wrote this on the command-line, it is sloppy but it seems to work:

~/programming/perl/Bioperl>perl -e 'use Bio::AlignIO;
  $io = Bio::AlignIO->new(-file => "aln.clustalw", -format => "clustalw" );
  my $aln = $io->next_aln;
  use IO::String;
  $out = IO::String->new(\$str);
  $ioout = Bio::AlignIO->new(-format=> "msf", -fh => $out );
  $ioout->write_aln($aln);
  print $str;'

I've created my SimpleAlign object from a file, you won't need to do that,
of course.

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Michael Cipriano
Sent: Friday, February 06, 2004 4:44 PM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] question on simple align object

Is there a way I can get a Simple Alignment object that I created from a
clustalw alignment into a string of a specific format? I do not want to
deal with any files at all, I want to just be able to print the
alignment from a cgi-script on a web page.

So far I have this:

# Create alignment
my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);

my $aln = $factory->align($seq_array_ref);

I tried all sorts of things, but can't get specifically what I need
(which is simply the whole alignment file as a msf formatted string). I
would like to not have to deal with any temporary files, unless there is
a way I can get to the temporary file that is created from the clustalw
alignment and just stick that into a string.

Thanks for any help in advance.

_______________________________________________
Bioperl-l mailing list
Bioperl-l@portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l

From shawnh at stanford.edu Fri Feb 6 17:09:32 2004
From: shawnh at stanford.edu (Shawn Hoon)
Date: Fri Feb 6 17:11:01 2004
Subject: [Bioperl-l] question on simple align object
In-Reply-To: <000f01c3ecfa$5414e320$8fae8080@Ripley>
References: <000f01c3ecfa$5414e320$8fae8080@Ripley>
Message-ID: <2452422C-58F1-11D8-8EAE-000A95783436@stanford.edu>

use Bio::AlignIO;
use IO::String;

my $stringio = IO::String->new($string);
my $aout = Bio::AlignIO->new(-fh=>$stringio,-format=>'clustalw');

> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
>
> my $aln = $factory->align($seq_array_ref);

$aout->write_aln($aln);

print "Alignment\n".$string."\n";

something like above should work.

shawn

On Friday, February 6, 2004, at 1:43PM, Michael Cipriano wrote:

> Is there a way I can get a Simple Alignment object that I created from
> a
> clustalw alignment into a string of a specific format? I do not want to
> deal with any files at all, I want to just be able to print the
> alignment from a cgi-script on a web page.
> > So far I have this: > > # Create alignment > my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM'); > my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params); > > my $aln = $factory->align($seq_array_ref); > > I tried all sorts of things, but can't get specifically what I need > (which is simply the whole alignment file as a msf formated string). I > would like to not have to deal with any temporary files, unless there > is > a way I can get to the temporary file that is created from the clustalw > alignment and just stick that into a string. > > Thanks for any help in advance. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From heikki at nildram.co.uk Sat Feb 7 19:26:32 2004 From: heikki at nildram.co.uk (Heikki Lehvaslaiho) Date: Sat Feb 7 19:32:38 2004 Subject: [Bioperl-l] Computing Allele Frequency From Heterozygosity In-Reply-To: References: Message-ID: <200402080026.32523.heikki@nildram.co.uk> Eric, We've got nothing along those lines. Please submit your contributions using bugzilla. -Heikki On Friday 06 Feb 2004 16:29, Eric Wang wrote: > Dear All, > > I am wondering if bioperl has implemented a way of converting > Heterozygosity number (found in dbSNP) to true allele frequency. If not, > I'd like to make some contributions. =) > > Many thanks! > > Eric > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. 
CB10 1SD, United Kingdom
     _/       Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
    ___ _/_/_/_/_/________________________________________________________

From barry.moore at genetics.utah.edu Sat Feb 7 23:24:48 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Sat Feb 7 23:30:59 2004
Subject: [Bioperl-l] Remote BLAST returning lots of 500 time out errors
In-Reply-To: <000f01c3ecfa$5414e320$8fae8080@Ripley>
References: <000f01c3ecfa$5414e320$8fae8080@Ripley>
Message-ID: <4025BA10.7000508@genetics.utah.edu>

I have a script that uses Bio::Tools::Run::RemoteBlast to BLAST the
translations of all ORFs from an mRNA transcript against the database.
It works fine, except that if I run several sequences at once, after
about 50 ORFs worth of BLASTing, NCBI starts to return errors (500 read
time-out) for every job submitted. I can't figure out what's going on
here. Any ideas? The script is kind of long and it takes several minutes
to get to the errors, but if anyone wants to try to recreate the error
I've attached the code below. Some of you will probably recognize bits
of your code that I've pilfered from various Bioperl docs. I'm running
Bioperl 1.4, ActiveState perl 5.8.0.805 on Windows XP. I get the error
by running:

perl ORF_BLAST1.pl --min_length 150 NM_001112 NM_007327 NM_015833 NM_021569

Barry Moore
Dept. Human Genetics
University of Utah

------------------------------------------------------------------------------------------------------

#!/usr/bin/perl

#ORF_BLAST1.pl
#See end of file for POD documentation

use strict;
use warnings;
use GD;
use Getopt::Long;
use Bio::SeqIO;
use Bio::PrimarySeq;
use Bio::DB::GenBank;
use Bio::Tools::Run::RemoteBlast;

#Give documentation when requested, or when missing command line arguments.
if ( ! $ARGV[0] || $ARGV[0] =~ /^-{1,2}(h|help|\?)$/i ) {
  system ( "perldoc", $0 ) and die "For usage, use perldoc $0\n";
  exit( 0 );
}

#Declare module level variables.
my $in_filename;   #This variable holds the filename of the input file.
my $out_filename;  #Surprisingly this variable holds the filename of the output file.
my $format;        #This variable defines the output format (png or jpg).
my $min_length;    #This variable defines the minimum ORF length to plot.
my $require_start; #This boolean variable indicates whether an ORF must begin with a start.
my $seqio;         #This variable holds the SeqIO object.

#Handle command line options.
GetOptions ( 'in_file:s'    => \$in_filename,
             'out_file:s'   => \$out_filename,
             'format:s'     => \$format,
             'min_length:i' => \$min_length,
             'start!'       => \$require_start
           );

my @accession = @ARGV;

#Set defaults.
$format        ||= 'jpg';
$min_length    ||= 150;
$require_start ||= 0;

#Define new SeqIO object. Take input from a file first if a
#filename has been specified. Otherwise take input from accession numbers off
#the command line (but don't try to do both)
if ($in_filename) {
  $seqio = Bio::SeqIO->new(-format=>'fasta', -file=>$in_filename)
    or die "could not create Bio::SeqIO";
}
elsif (@accession) {
  my $gb = new Bio::DB::GenBank();
  $seqio = $gb->get_Stream_by_id(\@accession);
}

#Remote-BLAST factory object creation and blast-parameter initialization
my $BLAST_factory = Bio::Tools::Run::RemoteBlast->new('-prog' => 'blastp',
                                                      '-data' => 'nr',
                                                      '-expect' => '10',
                                                      '-readmethod' => 'SearchIO'
                                                     );

#Main program loop to loop over all sequences input.
while (my $seq_obj = $seqio->next_seq) {

  #Assign sequence specific variables
  my @starts = ();
  my @stops = ();
  my @orfs = ();
  my $out_filename;
  my $sequence = $seq_obj->seq;
  my $sequence_length = length($sequence);
  my $header = $seq_obj->display_name."|".$seq_obj->desc;

  #Internal loop to find starts, stops, and ORFs
  for my $count1 (0 .. $sequence_length) {
    my $open = 0;
    if ($count1 < 4) {$open = 1}
    my $count2 = $count1;
    my $codon = substr($sequence, $count1, 3);
    my $frame = $count1 % 3; #Get the modulus of $count1/3 for ascertaining the frame

    #Convert the modulus above into frame 1, 2, or 3.
    if ($frame == 1) {$frame = 2}
    elsif ($frame == 2) {$frame = 3}
    elsif ($frame == 0) {$frame = 1}

    #Add starts to stack.
    if ($codon =~ /ATG/i) {
      push @starts, {start => $count1, frame => $frame};
      if ($require_start == 1) {$open = 1} #Open the ORF flag.
    }

    #Add stops to stack.
    if ($codon =~ /TGA|TAG|TAA/i) {
      push @stops, {stop => $count1, frame => $frame};
      if ($require_start == 0) {$open = 1} #Open the ORF flag.
    }

    #Find extent of ORF if one has been opened by either of the above
    #conditionals.
    if ($open == 1) {
      $codon = "";
      my $count2 = $count1;

      #Loop to step forward through ORF looking for next in frame stop.
      while (($codon !~ /TGA|TAG|TAA/i) and ($count2 < $sequence_length - 4)) {
        $count2 = $count2 + 3; #Keep it in frame.
        $codon = substr($sequence, $count2, 3);
      }

      #Make sure the ORF is long enough to count...
      if ($count2 - $count1 >= $min_length) {
        push @orfs, {begin => $count1,
                     end   => $count2,
                     frame => $frame
                    }; #...then push it onto the ORF stack.
      }
    }
  }

  @orfs || die "No ORFs of $min_length nucleotide in length found";

  #Loop to BLAST each ORF against the database, and check for a hit.
  my $BLAST_count;
  for my $orf (@orfs) {

    #Assign ORF specific variables.
    my $begin = $$orf{begin};
    my $end = $$orf{end};
    my $frame = $$orf{frame};
    $BLAST_count++;

    #Initialize subsequence as new sequence
    my $seq = new Bio::PrimarySeq (-seq => $seq_obj->subseq($begin + 1, $end),
                                   -display_id => "${frame}_${begin}_${end}");

    #Translate sequence
    my $trans = $seq->translate();

    #Blast the sequence against a database:
    my $job = $BLAST_factory->submit_blast($trans);
    print STDERR "Blasting ORF ",$BLAST_count," of ", scalar @orfs, "...";

    #Loop to load the RIDs returned for the BLAST job submitted (this probably
    #doesn't need to be a loop here but I won't take it out yet)
    while ( my @rids = $BLAST_factory->each_rid ) {

      #Loop to iterate over RIDs, and hit NCBI's BLAST server for a result
      foreach my $rid ( @rids ) {

        #Hit the server for a result on RID.
        my $blast_results = $BLAST_factory->retrieve_blast($rid);

        #Was a result returned?
        if( !ref($blast_results) ) {

          #If so and it returned an error remove that RID from the stack
          if ($blast_results < 0) {
            $BLAST_factory->remove_rid($rid);
          }
          print STDERR "."; #Keep the user staring at the dots.
          sleep 5;          #Plays nice with the servers.
        }

        #If a result was returned and it isn't an error, then pass it to a
        #variable...
        else {
          my $result = $blast_results->next_result();
          $BLAST_factory->remove_rid($rid); #...and remove its RID from the stack.

          #Check the result for a hit...
          my $hit = $result->next_hit;
          if (ref($hit)) {
            my $hsp = $hit->next_hsp;

            #...collect its evalue from the hsp object, and add to the ORFs hash
            $$orf{evalue} = $hsp->evalue();
          }

          #If no evalue found, default to 100 to keep undef from looking like a
          #significant e-value.
          else {$$orf{evalue} = 100}
          print "\n";
        }
      }
    }
  }

  #Main block to draw image.
  my $image = new GD::Image(900, 150); #Create a new image.

  #Allocate some colors.
  my %color = ( white   => $image->colorAllocate(255,255,255),
                aqua    => $image->colorAllocate(0,255,255),
                black   => $image->colorAllocate(0,0,0),
                blue    => $image->colorAllocate(0,0,255),
                gray    => $image->colorAllocate(128,128,128),
                fuchsia => $image->colorAllocate(255,0,255),
                green   => $image->colorAllocate(0,255,0),
                lime    => $image->colorAllocate(0,255,255),
                maroon  => $image->colorAllocate(128,0,0),
                navy    => $image->colorAllocate(0,0,128),
                olive   => $image->colorAllocate(128,128,0),
                purple  => $image->colorAllocate(128,0,128),
                red     => $image->colorAllocate(255,0,0),
                silver  => $image->colorAllocate(192,192,192),
                teal    => $image->colorAllocate(0,128,128),
                yellow1 => $image->colorAllocate(255,255,0),
                yellow2 => $image->colorAllocate(200,200,0),
                yellow3 => $image->colorAllocate(150,150,0)
              );

  #Make the background transparent and interlaced.
  $image->transparent($color{white});
  $image->interlaced('true');

  #Put a black frame around the picture.
  $image->rectangle(0,0,899,149,$color{black});

  #Add the title line.
  $image->string(gdGiantFont,10,10,$header,$color{black});

  #Draw the lines for each frame.
  $image->line(10,50,890,50,$color{black});
  $image->line(10,75,890,75,$color{black});
  $image->line(10,100,890,100,$color{black});

  #Draw a line for the ruler.
  $image->line(10,125,890,125,$color{black});

  #Loop to add ruler ticks and numbers to image.
  for my $tick (0 .. 10) {

    #Convert sequence coordinates to image X-axis values.
    $tick = $sequence_length/10*$tick;
    $tick = convert($tick, $sequence_length);

    #Add ruler ticks.
    $image->line($tick,125,$tick,130,$color{black});

    #Add numbers to ruler.
    $image->string(gdSmallFont,$tick-(2*length($tick-15)),130,$tick-15,$color{black});
  }

  #Loop to add ORFs to image.
  for my $orf (@orfs) {
    my $top;    #This variable sets the top of ORF rectangle.
    my $bottom; #This variable sets the bottom of ORF rectangle.

    #Assign the Y coordinates for the ORF to place them in the correct frame.
    if ($$orf{frame} == 1) {$top = 40; $bottom = 60}
    elsif ($$orf{frame} == 2) {$top = 65; $bottom = 85}
    elsif ($$orf{frame} == 3) {$top = 90; $bottom = 110}

    #Convert sequence coordinates to image X-axis values.
    my $begin = convert($$orf{begin}, $sequence_length);
    my $end = convert($$orf{end}, $sequence_length);

    #Assign a shade of yellow to the ORF if the BLAST on that ORF returned an
    #evalue.
    my $orf_color = $color{black}; #Default ORF color to black.
    if (defined $$orf{evalue}) {
      if ($$orf{evalue} <= 10) {$orf_color = $color{yellow3}}     #Dark yellow
      if ($$orf{evalue} <= 1.0e-3) {$orf_color = $color{yellow2}} #Medium yellow
      if ($$orf{evalue} < 1.0e-25) {$orf_color = $color{yellow1}} #Bright yellow
    }

    #Draw rectangles for the ORFs.
    $image->filledRectangle($begin,$top,$end,$bottom,$orf_color);

    #Print the e-value on the ORF if it is below 10.
    if ($$orf{evalue} < 10) {
      $image->string(gdSmallFont,$begin + 3,$top + 2,$$orf{evalue},$color{black});
    }
  }

  #Add green ticks to the image for each start.
  for my $start (@starts) {
    my $top;    #This variable sets the top of the start line.
    my $bottom; #This variable sets the bottom of the start line.
    #Assign the Y coordinates for the start line to put it in the correct frame.
    if    ($$start{frame} == 1) {$top = 50; $bottom = 60}
    elsif ($$start{frame} == 2) {$top = 75; $bottom = 85}
    elsif ($$start{frame} == 3) {$top = 100; $bottom = 110}
    #Convert sequence coordinates to image X-axis values.
    my $location = convert($$start{start}, $sequence_length);
    #Draw the start ticks.
    $image->line($location,$top,$location,$bottom,$color{green});
  }
  #Add red ticks to the image for each stop.
  for my $stop (@stops) {
    my $top;    #This variable sets the top of the stop line.
    my $bottom; #This variable sets the bottom of the stop line.
    #Assign the Y coordinates for the stop line to put it in the correct frame.
    if    ($$stop{frame} == 1) {$top = 40; $bottom = 60}
    elsif ($$stop{frame} == 2) {$top = 65; $bottom = 85}
    elsif ($$stop{frame} == 3) {$top = 90; $bottom = 110}
    #Convert sequence coordinates to image X-axis values.
    my $location = convert($$stop{stop}, $sequence_length);
    #Draw the stop ticks.
    $image->line($location,$top,$location,$bottom,$color{red});
  }
  #Set a default output filename if none was set on the command line.
  if (! $out_filename) {
    if ( $in_filename && $in_filename =~ /(.*?)\..*/) {
      $out_filename = $1.".".$format;
    }
    elsif ( $seq_obj->primary_id !~ /unknown/) {
      $out_filename = $seq_obj->display_name().".".$format;
    }
    else {
      $out_filename = $seq_obj->primary_id().".".$format;
    }
  }
  #Open a filehandle for output and make sure we are writing a binary stream.
  open (OUT, ">$out_filename") or die "Cannot open $out_filename for writing: $!";
  binmode OUT;
  #Write the image to a file in specified format.
  if ($format =~ /jpg|jpeg/) {
    print OUT $image->jpeg;
  }
  if ($format =~ /png/) {
    print OUT $image->png;
  }
  close OUT;
}
#A subroutine to convert sequence coordinates to x-axis values on the image.
sub convert {
  my ($value, $length) = @_;
  $value = (($value/$length)*870)+15; #Convert a sequence length value to an X-axis value.
  $value = sprintf("%.0f", $value); #Round $value to nearest integer.
  return $value;
}

=head1 NAME

ORF_BLAST1.pl

=head1 SYNOPSIS

perl ORF_Plot.pl [--options] NM_007327

=head1 DESCRIPTION

This program will take a sequence file as input, and generate a graphical
output of its ORF architecture in 3 frames, plotting ORFs, start codons (ATG)
and stop codons. It will BLAST the translation of each ORF against NCBI, and
color the ORFs in shades of yellow to black, depending on the e-value returned
for that ORF.

INPUT: Input can be a list of space separated accession numbers on the command
line, or a fasta file.

OUTPUT: Output is a figure saved as either a png or jpg file to the current
directory.

OPTIONS: Several options can be specified, but all are optional.

--in_file filename   Use to set the input file name. The file that contains
                     the input sequences in fasta format.

--out_file filename  Use to set the output file name. Defaults to input file
                     name, then Bioperl's display name (usually the accession
                     number), then Bioperl's accession number (usually the gi
                     number).

--min_size integer   Use to set the minimum ORF size that will be plotted in
                     the figure.

--start              Use to require plotted ORFs to begin with a start codon.

--format             Use to set the output format. Valid values are png or
                     jpg (or jpeg). Defaults to jpg.

=head1 USING

perl ORF_BLAST1.pl --min_size 300 --start --format png NM_001112 NM_007327 NM_015833

or

perl ORF_BLAST1.pl --in_file sequence.fasta --out_file image_file --min_size 300 --start

=head1 REQUIRES

GD
Getopt::Long
Bio::SeqIO
Bio::PrimarySeq
Bio::DB::GenBank
Bio::Tools::Run::RemoteBlast

=head1 AUTHOR

Barry Moore
Department of Human Genetics
University of Utah
Salt Lake City, UT 84112
USA

Address bug reports and comments to: barry.moore@genetics.utah.edu

=head1 BUGS

Currently after about 50 ORFs BLASTed, NCBI starts to return time-out errors.

=head1 FUTURE DIRECTIONS

Add command line options for the BLAST parameters.

=head1 COPYRIGHT

Copyright 2003, Barry Moore. All rights reserved.
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

=head1 SEE ALSO

=cut

From hcle028 at cse.unsw.edu.au Sun Feb 8 22:45:22 2004
From: hcle028 at cse.unsw.edu.au (Hong Ching Lee)
Date: Sun Feb 8 22:51:25 2004
Subject: [Bioperl-l] Problems running remote blast
Message-ID:

Hi everyone,

I have a question about running remote blast. The scenario is that I
have a DNA sequence stored as a string in $seq. What I'd like to do is
to submit it to blastn, then retrieve the result and put it into html
format without processing it. I've noticed the existence of modules like
Bio::Tools::Blast::HTML, but I'm not sure about how I should use them,
if I should use them at all.

Thank You,
Hong

PS: Thank You for answering my previous message

From Richard.Adams at ed.ac.uk Mon Feb 9 03:46:55 2004
From: Richard.Adams at ed.ac.uk (Richard Adams)
Date: Mon Feb 9 03:52:59 2004
Subject: [Bioperl-l] Remote BLAST returning lots of 500 time out errors
Message-ID: <402748FF.2030406@ed.ac.uk>

Hi Barry,

There's certainly nothing wrong with your code, there seems to be a
problem with the way the RIDs are stored in temporary files ... I get
the same problems with my code as well... will look into it.

Richard

--
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU
Tel: 44 131 651 1084
richard.adams@ed.ac.uk

From Richard.Adams at ed.ac.uk Mon Feb 9 08:48:56 2004
From: Richard.Adams at ed.ac.uk (Richard Adams)
Date: Mon Feb 9 08:55:01 2004
Subject: [Bioperl-l] Remote BLAST returning lots of 500 time out errors
Message-ID: <40278FC8.3050903@ed.ac.uk>

Hi Barry,

I've changed the RemoteBlast module so that it no longer uses temporary
files - version 1.19 in CVS. This is just a temporary solution as I'm
not sure why making temporary files is causing this problem just now.
But it should work OK all being in memory, unless you are making vast
Blast output files. You don't have to change your script at all. It now
appears to run OK.

Just curious, why don't you just submit the RNA sequence and use Blastx?
This translates all your sequences and means you only have to submit
1/6th as many sequences to the server.

Cheers
Richard

--
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU
Tel: 44 131 651 1084
richard.adams@ed.ac.uk

From barry.moore at genetics.utah.edu Mon Feb 9 09:44:39 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Mon Feb 9 09:50:48 2004
Subject: [Bioperl-l] Remote BLAST returning lots of 500 time out errors
In-Reply-To: <40278FC8.3050903@ed.ac.uk>
References: <40278FC8.3050903@ed.ac.uk>
Message-ID: <40279CD7.3070904@genetics.utah.edu>

Richard-

Thanks much for the help, and the solution. I use this program to help
look for trans-frame proteins - that is proteins that require a
frameshift for expression. With that in mind, this program BLASTs the
translation of every ORF, and plots that ORF in shades of yellow
(representing its e-value) on a 3 frame plot of the transcript. It may
be that blastx would work, and I could just map the location of
significant HSPs onto my plot.
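
A minimal sketch of that mapping, assuming plus-strand HSPs with 1-based query coordinates on the transcript. The helper names frame_of and shade_for_evalue and the example coordinates are made up for illustration; the e-value cut-offs are the ones the plotting script uses. With real BioPerl HSP objects the start would presumably come from $hsp->start('query').

```perl
use strict;
use warnings;

# Hypothetical helper: reading frame (1, 2 or 3) of a plus-strand,
# 1-based start coordinate on the transcript.
sub frame_of {
    my ($start) = @_;
    return ( ( $start - 1 ) % 3 ) + 1;
}

# Hypothetical helper: bin an e-value into the shade names used by the
# plotting script (black when there is no hit or the e-value is above 10).
sub shade_for_evalue {
    my ($evalue) = @_;
    return 'black' unless defined $evalue;
    my $shade = 'black';
    $shade = 'yellow3' if $evalue <= 10;         # dark yellow
    $shade = 'yellow2' if $evalue <= 1.0e-3;     # medium yellow
    $shade = 'yellow1' if $evalue < 1.0e-25;     # bright yellow
    return $shade;
}

# Made-up HSPs: query start/end on the transcript plus an e-value.
my @hsps = (
    { start => 1,   end => 300, evalue => 1e-40 },
    { start => 365, end => 520, evalue => 0.5 },
);

for my $hsp (@hsps) {
    printf "HSP %d-%d: frame %d, shade %s\n",
        $hsp->{start}, $hsp->{end},
        frame_of( $hsp->{start} ),
        shade_for_evalue( $hsp->{evalue} );
}
```

Each HSP then lands on the frame line given by frame_of, and its begin and end can go through the same convert() call the ORFs use.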

When I started working on the program I tried translating the entire
transcript (stops and all) in 3 frames, and BLASTing the 3 frames. I
noticed that I wouldn't get HSPs to some small ORFs that I could get by
BLASTing those ORFs individually. Because of that and because at the
time it seemed simpler to keep track of and plot the results if each ORF
was handled separately, I went that way. In retrospect now that I've
seen how long it can take to BLAST 26 small ORFs I think it would be a
good idea to go back and check more carefully if I can achieve the same
results with blastx. It may be that by tweaking the parameters to BLAST,
I can see hits to all the small ORFs on the transcript.

Thanks again for your help, and for the suggestions.

Barry

Richard Adams wrote:

> Hi Barry,
> I've changed the RemoteBlast module so that it no longer uses
> temporary files - version 1.19 in CVS.
> This is just a temporary solution as I'm not sure why making temporary
> files is causing this problem just now.
> But it should work OK all being in memory, unless you are making vast
> Blast output files.
> You don't have to change your script at all. It now appears to run OK.
>
> Just curious, why don't you just submit the RNA sequence and use
> Blastx? This translates all your sequences and means you only have to
> submit
> 1/6th as many sequences to the server.
> Cheers
> Richard
>

From Richard.Adams at ed.ac.uk Tue Feb 10 03:31:56 2004
From: Richard.Adams at ed.ac.uk (Richard Adams)
Date: Tue Feb 10 03:37:56 2004
Subject: [Bioperl-l] Problems running remote blast
Message-ID: <402896FC.8090607@ed.ac.uk>

Hi,

First of all make a sequence object:

my $seqobj = Bio::PrimarySeq->new(-seq        => $your_Sequence_String,
                                  -display_id => 'whatever');

Then use the synopsis from Bio::Tools::Run::RemoteBlast, which is a
minimal remote blasting module (changing parameters as appropriate).
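
For reference, the submit-and-poll loop from that synopsis can be sketched roughly as below. retrieve_blast and remove_rid are the calls Barry's script already makes on its factory, and each_rid is the matching method from the module synopsis; the wrapper name collect_results and its $wait argument are assumptions made here so the loop can be shown on its own, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical wrapper: poll a RemoteBlast-style factory until every
# submitted request ID (RID) has either produced a report or failed.
# $factory only needs each_rid, retrieve_blast and remove_rid methods.
sub collect_results {
    my ( $factory, $wait ) = @_;
    $wait = 5 unless defined $wait;    # seconds to pause between polls
    my @reports;
    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                # Not a reference: a negative value signals an error,
                # anything else means the search is not finished yet.
                $factory->remove_rid($rid) if $rc < 0;
                sleep $wait if $wait;
            }
            else {
                push @reports, $rc;    # report object, ready for next_result()
                $factory->remove_rid($rid);
            }
        }
    }
    return @reports;
}

# Usage with the real module would look roughly like this (not run here;
# it needs BioPerl and network access, and the parameters are examples):
#   my $factory = Bio::Tools::Run::RemoteBlast->new(-prog       => 'blastn',
#                                                   -data       => 'nr',
#                                                   -readmethod => 'SearchIO');
#   $factory->submit_blast($seqobj);
#   my ($rc) = collect_results($factory);
```

Keeping the loop in a sub that only relies on those three methods also makes it easy to try out against a stand-in object before pointing it at NCBI.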
The result that is returned in $result can be written in HTML:

my $result = $rc->next_result();
my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
                                -file   => ">searchio.html");
# get a result from Bio::SearchIO parsing or build it up in memory
$outhtml->write_result($result);

The SearchIO modules are clearly explained in a HOWTO document if you
want to know more.

Cheers
Richard

--
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU
Tel: 44 131 651 1084
richard.adams@ed.ac.uk

From sdavis2 at mail.nih.gov Tue Feb 10 12:02:45 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Tue Feb 10 12:04:50 2004
Subject: [Bioperl-l] Building biosql database errors
Message-ID:

I am trying to build a biosql database on a mysql database. I have mysql
and biosql schema running and can successfully load some data, but for a
proportion of the data when loading ontology or locuslink, I get the
following (many times). Am I doing something wrong, or is this to be
expected? I would just push on with --safe (as given below), but clearly
part of the data is not loaded correctly after looking at the result. I
have the same problem when loading locuslink. Any input is appreciated.
Thanks, Sean % perl ../../../bioperl-db/scripts/biosql/load_ontology.pl --safe --fmtargs "-defs_file,GO.defs" --dbuser sdavis --dbpass mic2222 --namespace "Gene Ontology" --format goflat component.ontology.2004-02-01 process.ontology.2004-02-01 function.ontology.2004-02-01 -------------------- WARNING --------------------- MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were ("GO:0042597","periplasmic space","The region between the inner (cytoplasmic) and outer membrane (Gram-negative bacteria) or inner membrane and cell wall (fungi).","") FKs (84) Duplicate entry 'periplasmic space-84' for key 2 --------------------------------------------------- Could not store term relationship (periplasmic space (sensu Fungi),IS_A,periplasmic space): ------------- EXCEPTION ------------- MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be found by unique key STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Library/Perl/5.8.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207 STACK Bio::DB::Persistent::PersistentObject::create /Library/Perl/5.8.1/Bio/DB/Persistent/PersistentObject.pm:243 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Library/Perl/5.8.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:170 STACK Bio::DB::Persistent::PersistentObject::create /Library/Perl/5.8.1/Bio/DB/Persistent/PersistentObject.pm:243 STACK (eval) ../../../bioperl-db/scripts/biosql/load_ontology.pl:548 STACK toplevel ../../../bioperl-db/scripts/biosql/load_ontology.pl:547 -------------------------------------- From awitney at sghms.ac.uk Tue Feb 10 13:07:52 2004 From: awitney at sghms.ac.uk (Adam Witney) Date: Tue Feb 10 13:15:00 2004 Subject: [Bioperl-l] Bioperl-db make test failures Message-ID: Hi, I am trying out bioperl-db and biosql. I downloaded both from CVS and installed the biosql schema ok. 
However I have some test failures with bioperl-db: t/cluster.......ok 5/160Use of uninitialized value in join or string at blib/lib/Bio/DB/BioSQL/BaseDriver.pm line 1835, line 1. -------------------- WARNING --------------------- MSG: insert in Bio::DB::BioSQL::BiosequenceAdaptor (driver) failed, values were ("","0","dna","") FKs (2) ERROR: invalid input syntax for integer: "" ... And t/species.......ok 68/65 ------------- EXCEPTION ------------- MSG: create: object (Bio::Species) failed to insert or to be found by unique key STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207 STACK Bio::DB::Persistent::PersistentObject::create blib/lib/Bio/DB/Persistent/PersistentObject.pm:243 STACK toplevel t/species.t:76 Are these known problems or have I missed something? Thanks Adam -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From andreas.bernauer at gmx.de Tue Feb 10 15:35:09 2004 From: andreas.bernauer at gmx.de (Andreas Bernauer) Date: Tue Feb 10 15:41:11 2004 Subject: [Bioperl-l] Building biosql database errors In-Reply-To: References: Message-ID: <20040210203509.GI369@hgt.mcb.uconn.edu> Sean Davis wrote: > I am trying to build a biosql database on a mysql database. I have mysql > and biosql schema running and can successfully load some data, but for a > proportion of the data when loading ontology or locuslink, I get the > following (many times). Am I doing something wrong, or is this to be > expected? > -------------------- WARNING --------------------- > MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were > ("GO:0042597","periplasmic space","The region between the inner > (cytoplasmic) and outer membrane (Gram-negative bacteria) or inner membrane > and cell wall (fungi).","") FKs (84) > Duplicate entry 'periplasmic space-84' for key 2 ^ | +-------------------------+ | I don't know, but maybe this + means something? 
I guess, the db can't handle duplicate keys. Andreas. From sdavis2 at mail.nih.gov Tue Feb 10 18:21:58 2004 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Tue Feb 10 18:23:49 2004 Subject: GO identifiers being mis-parsed? Was Re: [Bioperl-l] Building biosql database errors Message-ID: I thank Andreas for pointing out the obvious and the email list points this out as an ongoing problem, solved for the meantime by installing GO with --nodelete. However, there was another set of errors that remained after fixing this and seemingly related to terms such as: UM-BBD-pathwayID:van When I changed the ':' to '-' globally in the GO flat files, I got rid of the exceptions (Any ideas?). I continued to have issues with installing, though, in that the name column often contains the GO:xxxxxxx number rather than the name. I assume that the identifier column is supposed to contain the GO:xxxxxxx numbers, instead. It seems that any values in the flat file that contain a ':' are being treated as GO identifiers, as when I change all things like "metacyc:xxxx" or "EC:xxxx" to use '-' instead of ':', the output is as expected. I don't know enough about methods code to find where that parsing occurs, but just wanted to bring it up as an issue for me. Is this a problem specific to me or have others found similar issues? I am using bioperl-1.4 with bioperl-db installed from cvs this morning on macos 10.3.2. Sean On 2/10/04 3:35 PM, "Andreas Bernauer" wrote: > Sean Davis wrote: I am trying to build a biosql database on a mysql database. > I have mysql and biosql schema running and can successfully load some data, > but for a proportion of the data when loading ontology or locuslink, I get the > following (many times). Am I doing something wrong, or is this to be > expected? 
> >> -------------------- WARNING --------------------- MSG: insert in >> Bio::DB::BioSQL::TermAdaptor (driver) failed, values were >> ("GO:0042597","periplasmic space","The region between the inner (cytoplasmic) >> and outer membrane (Gram-negative bacteria) or inner membrane and cell wall >> (fungi).","") FKs (84) Duplicate entry 'periplasmic space-84' for key 2 ^ | >> +-------------------------+ | I don't know, but maybe this + means something? >> I guess, the db can't handle duplicate keys. >> > Andreas. _______________________________________________ Bioperl-l mailing > list Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > From hlapp at gmx.net Wed Feb 11 04:14:18 2004 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed Feb 11 04:20:19 2004 Subject: GO identifiers being mis-parsed? Was Re: [Bioperl-l] Building biosql database errors In-Reply-To: Message-ID: On Tuesday, February 10, 2004, at 03:21 PM, Sean Davis wrote: > UM-BBD-pathwayID:van > > When I changed the ':' to '-' globally in the GO flat files, I got rid > of > the exceptions (Any ideas?). I continued to have issues with > installing, > though, in that the name column often contains the GO:xxxxxxx number > rather > than the name. A bug was introduced into the GO parser (dagflat in fact) that causes this. I fixed it in the main trunk a week or two ago, but haven't yet migrated the fix to the branch. Will do that too. -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From hlapp at gnf.org Wed Feb 11 14:15:38 2004 From: hlapp at gnf.org (Hilmar Lapp) Date: Wed Feb 11 14:21:40 2004 Subject: [Bioperl-l] Building biosql database errors In-Reply-To: Message-ID: This could have multiple reasons. 
Generally speaking, in an ideal world the unique key constraint is on
the tuple (name, ontology), and given the error message this seems to be
the constraint you're violating, because

- you may have loaded GO before and did not specify --lookup or similar
  options that let the script deal with pre-existing content to be
  updated according to one of different policies (check out the
  load_ontology.pl POD for a more elaborate discussion of update
  options)

- the term that appears to violate the constraint exists also as an
  obsoleted term with the same name, but a different GO identifier, and
  you did not choose to ignore and delete obsoleted terms

The latter is particularly nasty and a reflection of the fact that the
world we live in is not ideal. If you are going to always purge existing
terms from the database and then reload GO then you can keep the unique
key constraint the way it is, and just need to make sure that this
strategy is reflected in the options (--noobsolete).

The downside of doing so (and of using --delobsolete, too) is that
deleting the terms will remove their associations to bioentries and
features as well, i.e., if any bioentry or feature was annotated with
either any GO term (if reload from scratch) or a GO term that is being
obsoleted (if using --delobsolete) then obviously you lose that
annotation when deleting the term(s). If you'll reload those
associations right afterwards, then there's no problem with this.

Alternatively, if you want to keep GO in the database and then update it
with a new release, then apart from choosing what to do with terms that
are obsolete (see the load_ontology.pl POD for the choices you have) you
need to change the unique key constraint to the tuple of (name,
ontology, is_obsolete). This should be a commented-out option in the
schema DDL file.

Hth,

-hilmar

On Tuesday, February 10, 2004, at 09:02 AM, Sean Davis wrote:

> I am trying to build a biosql database on a mysql database.
I have > mysql > and biosql schema running and can successfully load some data, but for > a > proportion of the data when loading ontology or locuslink, I get the > following (many times). Am I doing something wrong, or is this to be > expected? I would just push on with --safe (as given below), but > clearly > part of the data is not loaded correctly after looking at the result. > I > have the same problem when loading locuslink. Any input is > appreciated. > > Thanks, > Sean > > > % perl ../../../bioperl-db/scripts/biosql/load_ontology.pl --safe > --fmtargs > "-defs_file,GO.defs" --dbuser sdavis --dbpass mic2222 --namespace "Gene > Ontology" --format goflat component.ontology.2004-02-01 > process.ontology.2004-02-01 function.ontology.2004-02-01 > > -------------------- WARNING --------------------- > MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values > were > ("GO:0042597","periplasmic space","The region between the inner > (cytoplasmic) and outer membrane (Gram-negative bacteria) or inner > membrane > and cell wall (fungi).","") FKs (84) > Duplicate entry 'periplasmic space-84' for key 2 > --------------------------------------------------- > Could not store term relationship (periplasmic space (sensu > Fungi),IS_A,periplasmic space): > > ------------- EXCEPTION ------------- > MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be > found > by unique key > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create > /Library/Perl/5.8.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207 > STACK Bio::DB::Persistent::PersistentObject::create > /Library/Perl/5.8.1/Bio/DB/Persistent/PersistentObject.pm:243 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create > /Library/Perl/5.8.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:170 > STACK Bio::DB::Persistent::PersistentObject::create > /Library/Perl/5.8.1/Bio/DB/Persistent/PersistentObject.pm:243 > STACK (eval) ../../../bioperl-db/scripts/biosql/load_ontology.pl:548 > STACK toplevel 
../../../bioperl-db/scripts/biosql/load_ontology.pl:547 > > -------------------------------------- > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From hlapp at gnf.org Wed Feb 11 14:18:15 2004 From: hlapp at gnf.org (Hilmar Lapp) Date: Wed Feb 11 14:24:16 2004 Subject: [Bioperl-l] Bioperl-db make test failures In-Reply-To: Message-ID: <0AA63EAA-5CC7-11D8-BA20-000A959EB4C4@gnf.org> This is strange, actually both of them. Did you run the tests against a database with content loaded prior to the tests, or was it a freshly created instance of the schema? If the latter, which RDBMS, and version of bioperl are you using? -hilmar On Tuesday, February 10, 2004, at 10:07 AM, Adam Witney wrote: > Hi, > > I am trying out bioperl-db and biosql. I downloaded both from CVS and > installed the biosql schema ok. However I have some test failures with > bioperl-db: > > t/cluster.......ok 5/160Use of uninitialized value in join or string at > blib/lib/Bio/DB/BioSQL/BaseDriver.pm line 1835, line 1. > > -------------------- WARNING --------------------- > MSG: insert in Bio::DB::BioSQL::BiosequenceAdaptor (driver) failed, > values > were ("","0","dna","") FKs (2) > ERROR: invalid input syntax for integer: "" > > ... 
And > > t/species.......ok 68/65 > ------------- EXCEPTION ------------- > MSG: create: object (Bio::Species) failed to insert or to be found by > unique > key > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create > blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207 > STACK Bio::DB::Persistent::PersistentObject::create > blib/lib/Bio/DB/Persistent/PersistentObject.pm:243 > STACK toplevel t/species.t:76 > > Are these known problems or have I missed something? > > Thanks > > Adam > > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From awitney at sghms.ac.uk Wed Feb 11 14:39:27 2004 From: awitney at sghms.ac.uk (Adam Witney) Date: Wed Feb 11 14:46:24 2004 Subject: [Bioperl-l] Bioperl-db make test failures In-Reply-To: <0AA63EAA-5CC7-11D8-BA20-000A959EB4C4@gnf.org> Message-ID: On 11/2/04 7:18 pm, "Hilmar Lapp" wrote: > This is strange, actually both of them. Did you run the tests against a > database with content loaded prior to the tests, or was it a freshly > created instance of the schema? > > If the latter, which RDBMS, and version of bioperl are you using? I had created the database and only run load_ncbi_taxonomy.pl to download the taxonomy database from NCBI Both biosql-schema and bioper-db were downloaded from CVS yesterday. RDBMS is PostgreSQL 7.4.1 Thanks adam -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From hlapp at gnf.org Wed Feb 11 15:00:30 2004 From: hlapp at gnf.org (Hilmar Lapp) Date: Wed Feb 11 15:06:31 2004 Subject: [Bioperl-l] Bioperl-db make test failures In-Reply-To: Message-ID: That explains the species test failure. It tests, among other things, whether it can successfully insert a species. As it is not a made up taxon, it'll fail if you pre-loaded the ncbi taxon database. Generally, I recommend creating a test schema for test scripts that's separate from the instance you use for production or anything else that you don't want to throw away in an instant and not be sorry. -hilmar On Wednesday, February 11, 2004, at 11:39 AM, Adam Witney wrote: > On 11/2/04 7:18 pm, "Hilmar Lapp" wrote: > >> This is strange, actually both of them. Did you run the tests against >> a >> database with content loaded prior to the tests, or was it a freshly >> created instance of the schema? >> >> If the latter, which RDBMS, and version of bioperl are you using? > > I had created the database and only run load_ncbi_taxonomy.pl to > download > the taxonomy database from NCBI > > Both biosql-schema and bioper-db were downloaded from CVS yesterday. > RDBMS > is PostgreSQL 7.4.1 > > Thanks > > adam > > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From awitney at sghms.ac.uk Wed Feb 11 15:07:01 2004 From: awitney at sghms.ac.uk (Adam Witney) Date: Wed Feb 11 15:13:57 2004 Subject: [Bioperl-l] Bioperl-db make test failures In-Reply-To: Message-ID: I installed the bioperl-db module anyway, but when I tried to load a GenBank file into the database, the same species failure came up... Should I not load the ncbi taxon database? Thanks adam > That explains the species test failure. 
It tests, among other things, > whether it can successfully insert a species. As it is not a made up > taxon, it'll fail if you pre-loaded the ncbi taxon database. > > Generally, I recommend creating a test schema for test scripts that's > separate from the instance you use for production or anything else that > you don't want to throw away in an instant and not be sorry. > > -hilmar > > On Wednesday, February 11, 2004, at 11:39 AM, Adam Witney wrote: > >> On 11/2/04 7:18 pm, "Hilmar Lapp" wrote: >> >>> This is strange, actually both of them. Did you run the tests against >>> a >>> database with content loaded prior to the tests, or was it a freshly >>> created instance of the schema? >>> >>> If the latter, which RDBMS, and version of bioperl are you using? >> >> I had created the database and only run load_ncbi_taxonomy.pl to >> download >> the taxonomy database from NCBI >> >> Both biosql-schema and bioper-db were downloaded from CVS yesterday. >> RDBMS >> is PostgreSQL 7.4.1 >> >> Thanks >> >> adam >> >> >> -- >> This message has been scanned for viruses and >> dangerous content by MailScanner, and is >> believed to be clean. >> >> -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Annie.Law at nrc-cnrc.gc.ca Thu Feb 12 12:46:56 2004 From: Annie.Law at nrc-cnrc.gc.ca (Law, Annie) Date: Thu Feb 12 12:52:57 2004 Subject: [Bioperl-l] Locuslink parser Message-ID: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca> Hi, I would appreciate help with the following. I have searched for questions on the locuslink parser but have not found answers to my questions. I am trying to understand how to use the locuslink parser. I am most interested in obtaining the fields locuslink id, GO id, accession number, unigene id. However, when I use the following code. 
I am only able to get information for the fields:

ALIAS_PROT, ALIAS_SYMBOL, CDD, CHR, CURRENT_LOCUSID, ECNUM, EXTANNOT,
MAP, NC, NR, OFFICIAL_GENE_NAME, OFFICIAL_SYMBOL, PHENOTYPE,
PREFERRED_GENE_NAME, PREFERRED_PRODUCT, PREFERRED_SYMBOL, PRODUCT

In the best scenario I would like to be able to obtain all of the
information available from the LL_tmpl file from LocusLink, meaning all
of the fields. How do I access the fields I want after the parser has
done its work?

Thanks,
Annie.

ACCNUM ALIAS_PROT ALIAS_SYMBOL ASSEMBLY BUTTON CDD CHR COMP CONTIG
CURRENT_LOCUSID DB_DESCR DB_LINK ECNUM EVID EXTANNOT GO GRIF LINK
LOCUSID LOCUS_CONFIRMED LOCUS_TYPE MAP MAPLINK NC NG NM NP NR
OFFICIAL_GENE_NAME OFFICIAL_SYMBOL OMIM ORGANISM PHENOTYPE PHENOTYPE_ID
PMID PREFERRED_GENE_NAME PREFERRED_PRODUCT PREFERRED_SYMBOL PRODUCT PROT
RELL STATUS STS SUMFUNC SUMMARY TRANSVAR TYPE UNIGENE XG XM XP XR

use Bio::SeqIO;
use strict;

my $io = Bio::SeqIO->new(-file   => '/var/lib/mysql/LL_tmpl',
                         -format => "locuslink");

while (my $seq_obj = $io->next_seq()) {
  my $anno_collection = $seq_obj->annotation;
  foreach my $key ($anno_collection->get_all_annotation_keys) {
    my @annotations = $anno_collection->get_Annotations($key);
    foreach my $value (@annotations) {
      print "tagname: ", $value->tagname, "\n";
      # $value is a Bio::Annotation, and has an "as_text" method
      print " annotation value: ", $value->as_text, "\n";
    }
  }
} #cycling through all of the sequences.

From dfclark at neo.tamu.edu Thu Feb 12 13:16:45 2004
From: dfclark at neo.tamu.edu (David Clark)
Date: Thu Feb 12 13:22:16 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>
References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>
Message-ID: <9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>

Hello,

I'm a relative newcomer to bioperl, and would like a point in the right direction.
I need to separate the yeast genome into two partial genomes--one with all ORF's, and one with everything else. I have a tab delimited list of the ORF's with the coordinates, and can probably parse that myself, but I wanted to see if anyone could point me to some example code, or give me some place to start in separating genomes based on the coordinates. Thanks, David Clark dfclark@neo.tamu.edu From pm66 at nyu.edu Thu Feb 12 13:30:44 2004 From: pm66 at nyu.edu (Philip MacMenamin) Date: Thu Feb 12 13:37:50 2004 Subject: [Bioperl-l] Strangeness in bioperl-1.4::Bio::DB::GFF::Segment...? Message-ID: <200402121832.i1CIW42N006581@mx6.nyu.edu> Hi, So I have WS118 SQLdb running here, and I run select * from fattribute where gname = 'AH6.5'; and I get some stuff returned. So, if I run (via perl) my $panelSeg = $db->segment('AH6'); I get stuff returned (ie all of the AH's, like I'd expect). However, if I run: my $panelSeg = $db->segment('AH6.5'); I get nothing returned. This seems odd to me... I am of course doomed to figure out why as soon as I post this though :) Philip From lstein at cshl.edu Thu Feb 12 13:39:37 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Thu Feb 12 13:45:52 2004 Subject: [Bioperl-l] Strangeness in bioperl-1.4::Bio::DB::GFF::Segment...? In-Reply-To: <200402121832.i1CIW42N006581@mx6.nyu.edu> References: <200402121832.i1CIW42N006581@mx6.nyu.edu> Message-ID: <200402122039.37363.lstein@cshl.edu> The class of the gene sequence has changed. You'll have to get it this way: $panelSeg= $db->segment(CDS => 'AH6.5') Lincoln On Thursday 12 February 2004 08:30 pm, Philip MacMenamin wrote: > Hi, > > So I have WS118 SQLdb running here, and I run > select * from fattribute where gname = 'AH6.5'; > and I get some stuff returned. > > So, if I run (via perl) > my $panelSeg = $db->segment('AH6'); > I get stuff returned (ie all of the AH's, like I'd expect). > > However, if I run: > my $panelSeg = $db->segment('AH6.5'); > I get nothing returned. 
> > This seems odd to me... > > I am of course doomed to figure out why as soon as I post this > though :) > > Philip > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From hlapp at gnf.org Thu Feb 12 14:09:51 2004 From: hlapp at gnf.org (Hilmar Lapp) Date: Thu Feb 12 14:15:51 2004 Subject: [Bioperl-l] Locuslink parser In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca> Message-ID: <08A1CF46-5D8F-11D8-BD29-000A959EB4C4@gnf.org> On Thursday, February 12, 2004, at 09:46 AM, Law, Annie wrote: > I am most intereste in obtaining the fields locuslink id, GO id, > accession number, unigene id. The locuslink ID is the $seq->accession_number. GO should be there as term annotations, unigene ID and other accessions should be present as dbxref annotations. You can test for an annotation being a term annotation or a dbxref: foreach my $ann (@annotations) { if ($ann->isa("Bio::Ontology::TermI")) { # this is an ontology term as annotation } if ($ann->isa("Bio::Annotation::DBLink")) { # this is a dbxref annotation } } Using the map function you can easily filter for annotation types, for example: @term_annotations = map { $_->isa("Bio::Ontology::TermI"); } $seq->get_Annotations(); BTW if you want to get all annotations from a seq object, you can just say $seq->get_Annotations() and omit the key. Hth, -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 
92121 phone: +1-858-812-1757
-------------------------------------------------------------

From jason at cgt.duhs.duke.edu Thu Feb 12 14:19:05 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Feb 12 14:25:11 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To: <9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>
References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca> <9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>
Message-ID:

Do you want these as a fasta file per ORF and per non-ORF region, or just 2 datasets with the genome masked (all N's or lowercased)?

-jason

On Thu, 12 Feb 2004, David Clark wrote:

> Hello,
>
> I'm a relative newcomer to bioperl, and would like a point in the right
> direction. I need to separate the yeast genome into two partial
> genomes--one with all ORF's, and one with everything else. I have a
> tab delimited list of the ORF's with the coordinates, and can probably
> parse that myself, but I wanted to see if anyone could point me to some
> example code, or give me some place to start in separating genomes
> based on the coordinates.
>
> Thanks,
>
> David Clark
> dfclark@neo.tamu.edu
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From dfclark at neo.tamu.edu Thu Feb 12 14:59:20 2004
From: dfclark at neo.tamu.edu (David Clark)
Date: Thu Feb 12 15:05:32 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To:
References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca> <9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>
Message-ID:

Good point. What I need is two fasta files: one with the ORF regions masked, and one with the non-ORF regions masked. There was another thing I wanted to do that I didn't mention before: how can I generate the reverse complement of a whole genome file?
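
A minimal plain-Perl sketch of the two masked outputs described above. It assumes the ORF coordinates are 1-based and inclusive and that a chromosome fits in memory as one string; the sub name mask_regions and the sample data are made up for illustration. Since substr counts from 0, a 1-based start has to be shifted by one.

```perl
use strict;
use warnings;

# Hypothetical helper: returns two masked copies of $seq.
# $orfs is a reference to a list of [start, end] pairs (1-based, inclusive).
sub mask_regions {
    my ($seq, $orfs) = @_;
    my $orf_masked    = $seq;                    # ORFs will be replaced by N
    my $nonorf_masked = 'N' x length($seq);      # everything except ORFs stays N
    for my $orf (@$orfs) {
        my ($start, $end) = @$orf;
        # features on the reverse strand may be listed with start > end
        ($start, $end) = ($end, $start) if $start > $end;
        my $len = $end - $start + 1;
        # substr offsets are 0-based, hence $start - 1
        substr($orf_masked,    $start - 1, $len, 'N' x $len);
        substr($nonorf_masked, $start - 1, $len, substr($seq, $start - 1, $len));
    }
    return ($orf_masked, $nonorf_masked);
}

my ($no_orfs, $only_orfs) = mask_regions('ACGTACGTAC', [ [2, 4], [8, 7] ]);
print "$no_orfs\n$only_orfs\n";
```

Each returned string has the same length as the input, so coordinates stay valid in both files.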
On Feb 12, 2004, at 1:19 PM, Jason Stajich wrote: > You want these as a fasta file per orf and per non-orf region or just 2 > datasets with the genome masked (all N's or lowercased)? > > -jason > On Thu, 12 Feb 2004, David Clark wrote: > >> Hello, >> >> I'm a relative newcomer to bioperl, and would like a point in the >> right >> direction. I need to separate the yeast genome into two partial >> genomes--one with all ORF's, and one with everything else. I have a >> tab delimited list of the ORF's with the coordinates, and can probably >> parse that myself, but I wanted to see if anyone could point me to >> some >> example code, or give me some place to start in separating genomes >> based on the coordinates. >> >> Thanks, >> >> David Clark >> dfclark@neo.tamu.edu From ryank at drizzle.com Thu Feb 12 15:29:03 2004 From: ryank at drizzle.com (Ryan Kuykendall) Date: Thu Feb 12 15:34:57 2004 Subject: [Bioperl-l] Fasta Genome Splice In-Reply-To: Message-ID: I'm sure there is a Perl module for generating the reverse compliment of a whole genome, but assuming you wanted to write the code from scratch: ## ...and assuming your genome file has been turned into an array of bases ## called @listOfBases; my $baseComplimentMap = { 'a' => 't', 'c' => 'g', 'g' => 'c', 't' => 'a', }; my @baseComplimentList = (); foreach my $base ( @listOfBases ) { my $complimentBase = $baseComplimentMap->{$base}; push( @baseComplimentList, $complimentBase ); } That would do it... ============================================================ Ryan Kuykendall ryank@drizzle.com http://undef.com/ryank/ryanAtBawa50percent.JPG ============================================================ On Thu, 12 Feb 2004, David Clark wrote: > Good point. What I need is two fasta files: one with the ofr regions > masked, and one with the non-ofr regions masked. There was another > thing I wanted to do that I didn't mention before: how can I generate > the reverse compliment of a whole genome file? 
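
Note that the loop in Ryan's post builds the complement only; a reverse complement also needs the order of the bases reversed. A minimal plain-Perl sketch using reverse and tr (the sub name revcomp is illustrative; only A, C, G and T in either case are translated, anything else such as N is left as it is):

```perl
use strict;
use warnings;

sub revcomp {
    my ($seq) = @_;
    my $rc = reverse $seq;           # scalar assignment: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # complement each base, preserving case
    return $rc;
}

print revcomp('ATGC'), "\n";
```

Working on the whole string this way avoids splitting a chromosome into an array of single bases.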
> > On Feb 12, 2004, at 1:19 PM, Jason Stajich wrote: > > > You want these as a fasta file per orf and per non-orf region or just 2 > > datasets with the genome masked (all N's or lowercased)? > > > > -jason > > On Thu, 12 Feb 2004, David Clark wrote: > > > >> Hello, > >> > >> I'm a relative newcomer to bioperl, and would like a point in the > >> right > >> direction. I need to separate the yeast genome into two partial > >> genomes--one with all ORF's, and one with everything else. I have a > >> tab delimited list of the ORF's with the coordinates, and can probably > >> parse that myself, but I wanted to see if anyone could point me to > >> some > >> example code, or give me some place to start in separating genomes > >> based on the coordinates. > >> > >> Thanks, > >> > >> David Clark > >> dfclark@neo.tamu.edu > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- ============================================================ Ryan Kuykendall ryank@drizzle.com http://undef.com/ryank/ryanAtBawa50percent.JPG ============================================================ From jason at cgt.duhs.duke.edu Thu Feb 12 15:46:33 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Thu Feb 12 15:52:39 2004 Subject: [Bioperl-l] Fasta Genome Splice In-Reply-To: References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca> <9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu> Message-ID: On Thu, 12 Feb 2004, David Clark wrote: > Good point. What I need is two fasta files: one with the ofr regions > masked, and one with the non-ofr regions masked. This is a little bit of work, but pretty easy since you can fit whole yeast chromosomes into memory. I do it by figuring out what I want to mask and then do: substr($chromseq,$start,$len,'N'x$len) So you can just write a simple parser for the chromsomal_features.tab while( ){ my ($feature,$gene,$sgdid, ... 
etc ) = split(/\t/,$_); # do the substr replace here } > There was another thing I wanted to do that I didn't mention before: how > can I generate the reverse compliment of a whole genome file? That's easy with emboss % revseq FILE.fwd FILE.rev With bioperl -- see the Sequence HOWTO in the howto section of the bioperl website. you want to use the revcom method in bioperl Bio::PrimarySeq objects. # change fasta to whatever format you have/want the sequences in my $in = Bio::SeqIO->new(-file => 'filename', -format => 'fasta'); my $out = Bio::SeqIO->new(-file => '>filename.rev', -format => 'fasta'); while( my $s = $in->next_seq ) { $out->write_seq($s->revcom); } -jason > On Feb 12, 2004, at 1:19 PM, Jason Stajich wrote: > > > You want these as a fasta file per orf and per non-orf region or just 2 > > datasets with the genome masked (all N's or lowercased)? > > > > -jason > > On Thu, 12 Feb 2004, David Clark wrote: > > > >> Hello, > >> > >> I'm a relative newcomer to bioperl, and would like a point in the > >> right > >> direction. I need to separate the yeast genome into two partial > >> genomes--one with all ORF's, and one with everything else. I have a > >> tab delimited list of the ORF's with the coordinates, and can probably > >> parse that myself, but I wanted to see if anyone could point me to > >> some > >> example code, or give me some place to start in separating genomes > >> based on the coordinates. > >> > >> Thanks, > >> > >> David Clark > >> dfclark@neo.tamu.edu > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From lstein at cshl.edu Fri Feb 13 04:29:50 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Fri Feb 13 04:35:56 2004 Subject: [Bioperl-l] Fasta Genome Splice In-Reply-To: References: Message-ID: <200402131129.50947.lstein@cshl.edu> There is actually a one-liner for this. 
You can find it in Jim Tisdall's "Beginning Bioinformatics" book, which I strongly recommend to anyone who wants to do basic bioinformatics tasks without learning Bioperl. Lincoln On Thursday 12 February 2004 10:29 pm, Ryan Kuykendall wrote: > I'm sure there is a Perl module for generating the reverse > compliment of a whole genome, but assuming you wanted to write the > code from scratch: > > ## ...and assuming your genome file has been turned into an array > of bases ## called @listOfBases; > > my $baseComplimentMap = > { > 'a' => 't', > 'c' => 'g', > 'g' => 'c', > 't' => 'a', > }; > > my @baseComplimentList = (); > > foreach my $base ( @listOfBases ) > { > my $complimentBase = $baseComplimentMap->{$base}; > push( @baseComplimentList, $complimentBase ); > } > > That would do it... > > ============================================================ > Ryan Kuykendall > ryank@drizzle.com > > http://undef.com/ryank/ryanAtBawa50percent.JPG > ============================================================ > > On Thu, 12 Feb 2004, David Clark wrote: > > Good point. What I need is two fasta files: one with the ofr > > regions masked, and one with the non-ofr regions masked. There > > was another thing I wanted to do that I didn't mention before: > > how can I generate the reverse compliment of a whole genome file? > > > > On Feb 12, 2004, at 1:19 PM, Jason Stajich wrote: > > > You want these as a fasta file per orf and per non-orf region > > > or just 2 datasets with the genome masked (all N's or > > > lowercased)? > > > > > > -jason > > > > > > On Thu, 12 Feb 2004, David Clark wrote: > > >> Hello, > > >> > > >> I'm a relative newcomer to bioperl, and would like a point in > > >> the right > > >> direction. I need to separate the yeast genome into two > > >> partial genomes--one with all ORF's, and one with everything > > >> else. 
I have a tab delimited list of the ORF's with the > > >> coordinates, and can probably parse that myself, but I wanted > > >> to see if anyone could point me to some > > >> example code, or give me some place to start in separating > > >> genomes based on the coordinates. > > >> > > >> Thanks, > > >> > > >> David Clark > > >> dfclark@neo.tamu.edu > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From Annie.Law at nrc-cnrc.gc.ca Fri Feb 13 11:53:04 2004 From: Annie.Law at nrc-cnrc.gc.ca (Law, Annie) Date: Fri Feb 13 11:59:07 2004 Subject: [Bioperl-l] Locuslink parser Message-ID: <10C94843061E094A98C02EB77CFC328722FE02@nrcmrdex1d.imsb.nrc.ca> Hi Hilmar, Thanks for your response. By what you're saying I think that my existing code would be able To access the GO identifier. If I look up the tagname molecular function then I will get the value to be for example: Molecular function|ATP binding|GO:0005524. The method that I can think of is to take this value and write some code To parse the GO identifier out. Is there a more direct method? I used the test to test for term annotation or dbxref then if it was dbxref I was able to get the primary id and the Database name. Thanks! I am learning more about the objects I am using. Do you know if there is some doucmentation with Figures showing all of the relationship of objects with Bio::Seq class eg relationship of Bio::Seq and Bio::Annotation Collection among others. However, I am still unable to get all of the fields for example SUMFUNC( a brief summary of the function of the products of this locus), ORGANISM, OMIM etc... I am not sure how to access these. 
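
For the GO value quoted earlier in this message, taking the as_text string apart in plain Perl could look roughly like this. It is only a sketch under the assumption that the text always has the pipe-delimited layout shown (category|term name|GO id); the sub name parse_go_text is made up for illustration and is not part of BioPerl.

```perl
use strict;
use warnings;

sub parse_go_text {
    my ($text) = @_;
    my ($category, $name, $id) = split /\|/, $text;
    # accept only something that looks like a GO identifier
    return unless defined $id && $id =~ /^(GO:\d{7})$/;
    return { category => $category, name => $name, id => $1 };
}

my $go = parse_go_text('Molecular function|ATP binding|GO:0005524');
print "$go->{id}\n" if $go;
```

The sub returns nothing for strings that do not match, so the caller can skip annotations that are not GO terms.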
It also seems if I use foreach my $ann (@annotations) { if ($ann->isa("Bio::Ontology::TermI")) { # this is an ontology term as annotation } if ($ann->isa("Bio::Annotation::DBLink")) { # this is a dbxref annotation } } I am filtering out some of the annotation types such as OFFICIAL_GENE_NAME, CHR, OFFICIAL_SYMBOL, etc.. I only get GO information and DBLINK information. If I use the following I will get the maximum number of annotation and dbxref fields I have been able to extract so far. Is there another category I am missing. Better yet how do I find out what are the other missing categories? Ie. Other than Bio::Ontology::TermI, or Bio::Annotation::DBLink while (my $seq_obj=$io->next_seq()){ my $anno_collection = $seq_obj->annotation; foreach my $key ($anno_collection->get_all_annotation_keys){ my @annotations = $anno_collection->get_Annotations($key); foreach my $value (@annotations){ print "tagname: ", $value->tagname, "\n"; # $value is an Bio::Annotation, and has an "as_text" method print " annotation value: ", $value->as_text, "\n"; } } } **In the example you provided below I can see that all of the type Bio::Ontology::TermI annotation types being Grouped and stuck in @term_annotations but what is the $_-> for ? And why do you need the line $seq->get_Annotations(); Below it? @term_annotations = map { $_->isa("Bio::Ontology::TermI"); } $seq->get_Annotations(); Thanks very much, Annie. -----Original Message----- From: Hilmar Lapp [mailto:hlapp@gnf.org] Sent: Thursday, February 12, 2004 2:10 PM To: Law, Annie Cc: 'bioperl-l@bioperl.org' Subject: Re: [Bioperl-l] Locuslink parser On Thursday, February 12, 2004, at 09:46 AM, Law, Annie wrote: > I am most intereste in obtaining the fields locuslink id, GO id, > accession number, unigene id. The locuslink ID is the $seq->accession_number. GO should be there as term annotations, unigene ID and other accessions should be present as dbxref annotations. 
You can test for an annotation being a term annotation or a dbxref: foreach my $ann (@annotations) { if ($ann->isa("Bio::Ontology::TermI")) { # this is an ontology term as annotation } if ($ann->isa("Bio::Annotation::DBLink")) { # this is a dbxref annotation } } Using the map function you can easily filter for annotation types, for example: @term_annotations = map { $_->isa("Bio::Ontology::TermI"); } $seq->get_Annotations(); BTW if you want to get all annotations from a seq object, you can just say $seq->get_Annotations() and omit the key. Hth, -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From brian_osborne at cognia.com Fri Feb 13 12:17:25 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Fri Feb 13 12:23:59 2004 Subject: [Bioperl-l] Locuslink parser In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE02@nrcmrdex1d.imsb.nrc.ca> Message-ID: Annie, >Do >you know if there is some doucmentation with Figures showing all of the >relationship of objects with Bio::Seq class eg relationship of Bio::Seq and >Bio::Annotation Collection among others. There are class diagrams available, either as DIA files in the models/ directory within the package or as PDF on the Web documentation page (http://www.bioperl.org/Core/Latest/modules.html). There are also diagrams in the Pasteur tutorial (http://www.pasteur.fr/recherche/unites/sis/formation/bioperl). Brian O. -----Original Message----- From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Law, Annie Sent: Friday, February 13, 2004 11:53 AM To: 'Hilmar Lapp' Cc: 'bioperl-l@bioperl.org' Subject: RE: [Bioperl-l] Locuslink parser Hi Hilmar, Thanks for your response. By what you're saying I think that my existing code would be able To access the GO identifier. 
If I look up the tagname molecular function then I will get the value to be for example: Molecular function|ATP binding|GO:0005524. The method that I can think of is to take this value and write some code To parse the GO identifier out. Is there a more direct method? I used the test to test for term annotation or dbxref then if it was dbxref I was able to get the primary id and the Database name. Thanks! I am learning more about the objects I am using. Do you know if there is some doucmentation with Figures showing all of the relationship of objects with Bio::Seq class eg relationship of Bio::Seq and Bio::Annotation Collection among others. However, I am still unable to get all of the fields for example SUMFUNC( a brief summary of the function of the products of this locus), ORGANISM, OMIM etc... I am not sure how to access these. It also seems if I use foreach my $ann (@annotations) { if ($ann->isa("Bio::Ontology::TermI")) { # this is an ontology term as annotation } if ($ann->isa("Bio::Annotation::DBLink")) { # this is a dbxref annotation } } I am filtering out some of the annotation types such as OFFICIAL_GENE_NAME, CHR, OFFICIAL_SYMBOL, etc.. I only get GO information and DBLINK information. If I use the following I will get the maximum number of annotation and dbxref fields I have been able to extract so far. Is there another category I am missing. Better yet how do I find out what are the other missing categories? Ie. 
Other than Bio::Ontology::TermI, or Bio::Annotation::DBLink while (my $seq_obj=$io->next_seq()){ my $anno_collection = $seq_obj->annotation; foreach my $key ($anno_collection->get_all_annotation_keys){ my @annotations = $anno_collection->get_Annotations($key); foreach my $value (@annotations){ print "tagname: ", $value->tagname, "\n"; # $value is an Bio::Annotation, and has an "as_text" method print " annotation value: ", $value->as_text, "\n"; } } } **In the example you provided below I can see that all of the type Bio::Ontology::TermI annotation types being Grouped and stuck in @term_annotations but what is the $_-> for ? And why do you need the line $seq->get_Annotations(); Below it? @term_annotations = map { $_->isa("Bio::Ontology::TermI"); } $seq->get_Annotations(); Thanks very much, Annie. -----Original Message----- From: Hilmar Lapp [mailto:hlapp@gnf.org] Sent: Thursday, February 12, 2004 2:10 PM To: Law, Annie Cc: 'bioperl-l@bioperl.org' Subject: Re: [Bioperl-l] Locuslink parser On Thursday, February 12, 2004, at 09:46 AM, Law, Annie wrote: > I am most intereste in obtaining the fields locuslink id, GO id, > accession number, unigene id. The locuslink ID is the $seq->accession_number. GO should be there as term annotations, unigene ID and other accessions should be present as dbxref annotations. You can test for an annotation being a term annotation or a dbxref: foreach my $ann (@annotations) { if ($ann->isa("Bio::Ontology::TermI")) { # this is an ontology term as annotation } if ($ann->isa("Bio::Annotation::DBLink")) { # this is a dbxref annotation } } Using the map function you can easily filter for annotation types, for example: @term_annotations = map { $_->isa("Bio::Ontology::TermI"); } $seq->get_Annotations(); BTW if you want to get all annotations from a seq object, you can just say $seq->get_Annotations() and omit the key. 
Hth, -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- _______________________________________________ Bioperl-l mailing list Bioperl-l@portal.open-bio.org http://portal.open-bio.org/mailman/listinfo/bioperl-l From hlapp at gnf.org Fri Feb 13 14:28:18 2004 From: hlapp at gnf.org (Hilmar Lapp) Date: Fri Feb 13 14:34:18 2004 Subject: [Bioperl-l] Locuslink parser In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE02@nrcmrdex1d.imsb.nrc.ca> Message-ID: On Friday, February 13, 2004, at 08:53 AM, Law, Annie wrote: > I am learning more about the objects I am using. Do > you know if there is some doucmentation with Figures showing all of the > relationship of objects with Bio::Seq class eg relationship of > Bio::Seq and > Bio::Annotation Collection among others. > Brian answered that, right? > However, I am still unable to get all of the fields for example > SUMFUNC( a > brief summary of the function of the products of this locus), > ORGANISM, OMIM > etc... I am not sure how to access these. SUMFUNC becomes an annotation of type Bio::Annotation::SimpleValue, with a tag name of SUMFUNC. ORGANISM is a Bio::Species object available through $seq->species. OMIM references should be available as dbxrefs (Bio::Annotation::DBLink), possibly with the database renamed to 'MIM'. There's I think not a good reference yet as to where which tag goes, but the bottom line is that almost every tag ends up as an annotation of some kind, with ORGANISM being a notable exception. > > It also seems if I use > foreach my $ann (@annotations) { > if ($ann->isa("Bio::Ontology::TermI")) { > # this is an ontology term as annotation > } > if ($ann->isa("Bio::Annotation::DBLink")) { > # this is a dbxref annotation > } > } > I am filtering out some of the annotation types such as > OFFICIAL_GENE_NAME, > CHR, OFFICIAL_SYMBOL, etc.. 
I'm not sure I understand what you mean. I just gave some examples for how to test what type an annotation is. There are other types too than the two given in the example. The array you get from $seq->annotation->get_Annotations() does contain all and any annotation that has been associated with the sequence.

> I only get GO information and DBLINK
> information.
> If I use the following I will get the maximum number of annotation and
> dbxref fields I have been able to extract so far. Is there another
> category
> I am missing. Better yet how do I find out what are the other missing
> categories? Ie. Other than Bio::Ontology::TermI, or
> Bio::Annotation::DBLink
>

Check out Bio/Annotation/*.pm to see all theoretically possible types. The most important are DBLink, SimpleValue, OntologyTerm (which basically adapts a Bio::Ontology::TermI), Comment, and Reference. Note that Reference is not used by the locuslink parser at this point.

>
> **In the example you provided below I can see that all of the type
> Bio::Ontology::TermI annotation types being Grouped and stuck in
> @term_annotations but what is the $_-> for ? And why do you need the
> line
> $seq->get_Annotations(); Below it?

It's perl syntax and in part obfuscated by my or your email reader introducing a line break after the closing curly brace. Check out

$ perldoc -f map

for documentation on how to use the map function.

Now, using the map function in my example was in fact wrong, and calling get_Annotations() on a Bio::SeqI object also won't work. Sorry about these mistakes. Here's the corrected version:

@term_anns = grep { $_->isa("Bio::Ontology::TermI"); } $seq->annotation->get_Annotations();

(There was no linebreak above, but adding one won't bother perl.)

Again, you can read about grep in perl by

$ perldoc -f grep

-hilmar

> @term_annotations = map { $_->isa("Bio::Ontology::TermI"); }
> $seq->get_Annotations();
>
> Thanks very much,
> Annie.
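
On map versus grep in the corrected example above, the difference is easy to see outside BioPerl. A small plain-Perl sketch, with made-up packages (My::Term, My::DBLink) standing in for the annotation classes: map returns the value of the block for every element, which here is just the true/false result of isa, while grep returns the elements for which the block is true.

```perl
use strict;
use warnings;

# Made-up stand-in classes, only for this demonstration
{ package My::Term;   sub new { my $class = shift; return bless {}, $class } }
{ package My::DBLink; sub new { my $class = shift; return bless {}, $class } }

my @anns = (My::Term->new, My::DBLink->new, My::Term->new);

my @flags = map  { $_->isa('My::Term') } @anns;   # one isa() result per element
my @terms = grep { $_->isa('My::Term') } @anns;   # only the My::Term objects

print scalar(@flags), " flags, ", scalar(@terms), " terms\n";
```

So the map version fills the array with booleans rather than annotation objects, which is why grep is the right tool for filtering by type.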
> > > > -----Original Message----- > From: Hilmar Lapp [mailto:hlapp@gnf.org] > Sent: Thursday, February 12, 2004 2:10 PM > To: Law, Annie > Cc: 'bioperl-l@bioperl.org' > Subject: Re: [Bioperl-l] Locuslink parser > > > > On Thursday, February 12, 2004, at 09:46 AM, Law, Annie wrote: > >> I am most intereste in obtaining the fields locuslink id, GO id, >> accession number, unigene id. > > The locuslink ID is the $seq->accession_number. GO should be there as > term annotations, unigene ID and other accessions should be present as > dbxref annotations. > > You can test for an annotation being a term annotation or a dbxref: > > foreach my $ann (@annotations) { > if ($ann->isa("Bio::Ontology::TermI")) { > # this is an ontology term as annotation > } > if ($ann->isa("Bio::Annotation::DBLink")) { > # this is a dbxref annotation > } > } > > Using the map function you can easily filter for annotation types, for > example: > > @term_annotations = map { $_->isa("Bio::Ontology::TermI"); } > $seq->get_Annotations(); > > BTW if you want to get all annotations from a seq object, you can just > say $seq->get_Annotations() and omit the key. > > Hth, > > -hilmar > -- > ------------------------------------------------------------- > Hilmar Lapp email: lapp at gnf.org > GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 > ------------------------------------------------------------- > > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 
92121 phone: +1-858-812-1757 ------------------------------------------------------------- From dag at sonsorol.org Fri Feb 13 14:53:51 2004 From: dag at sonsorol.org (Chris Dagdigian) Date: Fri Feb 13 14:59:44 2004 Subject: [Bioperl-l] Re: Bundle-Bioperl installation under Activestate In-Reply-To: <20040213190801.31707.qmail@web10003.mail.yahoo.com> References: <20040213190801.31707.qmail@web10003.mail.yahoo.com> Message-ID: <402D2B4F.6000104@sonsorol.org> Hi Jennifer, I've never used perl on Windows - only under various Unix flavors and my personal systems are mostly Mac OS X or Linux these days. I'm cc'ing this reply to the bioperl discussion list where I know there are active windows users of bioperl. We also have some .ppd files on our download site http://bioperl.org/DIST/ but since I've never used the ActiveState stuff I have very little clue about them! A quick google search for io::string + ppd turnd up this link which may be helpful: http://www.apache.org/dist/perl/win32-bin/ppms/IO-String.ppd There seem to be several ppm archives on the net, IO-String is also found here apparently: http://www.online-mirror.org/apache/perl/win32-bin/ppms/ Good luck with the course! Regards, Chris Jennifer Hsu wrote: > Hi, Chris: > My name is Jennifer, I am a student in BioPerl class at Foothill College, Los Altos, CA. Our class > is trying to install bioperl. The entire class is encountering the problem of not being able to > find IO-String. > > So I searched for IO-String and got 4 choices: > ppm> search IO-String > Searching in Active Repositories > 1. IO-String <1.02> Emulate IO::File interface for in-core strings > 2. IO-String <1.03> Emulate file interface for in-core strings > 3. IO-String <1.04> Emulate file interface for in-core strings > 4. IO-stringy <2.108> stringy - I/O on in-core objects like strings and ar~ > > - I tried to install 3, 2, 1 , but each time I got: > PPD for 'IO-String.ppd' could not be found. 
> It found that IO-stringy (item 4) is already installed in my system, but BioPerl is looking for
> IO-String.
> - I have ActivePerl 5.6.1.635. Is this build incompatible with Bioperl?
> - Please advise me, where can I find IO-String? What can I do to build this PPD. All the students
> in my class are stuck, Help!
> Thanks
> Jennifer
>

--
Chris Dagdigian,
BioTeam - Independent life science IT & informatics consulting
Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag Web: http://bioteam.net

From barry.moore at genetics.utah.edu Fri Feb 13 19:53:56 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Fri Feb 13 19:59:52 2004
Subject: [Bioperl-l] Re: Bundle-Bioperl installation under Activestate
In-Reply-To: <402D2B4F.6000104@sonsorol.org>
References: <20040213190801.31707.qmail@web10003.mail.yahoo.com> <402D2B4F.6000104@sonsorol.org>
Message-ID: <402D71A4.8050000@genetics.utah.edu>

Jennifer,

What repositories are you using with ppm (try typing rep at the ppm> prompt if you don't know)? I just reinstalled IO-String O.K. but it may have installed from a non-standard ppm repository. I don't know why ActiveState wouldn't have IO-String, but Randy Kobes has a ppm collection that does. Set him up as a repository, and you should be able to install. Again from the ppm> prompt, type "rep add Kobes http://theoryx5.uwinnipeg.ca/ppms". Then retry your IO-String install. I'm using ActiveState Perl 5.8, but from what I hear on this list, 5.6 should work just fine.

Good luck,

Barry Moore

Chris Dagdigian wrote:

>
> Hi Jennifer,
>
> I've never used perl on Windows - only under various Unix flavors and
> my personal systems are mostly Mac OS X or Linux these days.
>
> I'm cc'ing this reply to the bioperl discussion list where I know
> there are active windows users of bioperl.
We also have some .ppd > files on our download site http://bioperl.org/DIST/ but since I've > never used the ActiveState stuff I have very little clue about them! > > A quick google search for io::string + ppd turnd up this link which > may be helpful: > > http://www.apache.org/dist/perl/win32-bin/ppms/IO-String.ppd > > There seem to be several ppm archives on the net, IO-String is also > found here apparently: > http://www.online-mirror.org/apache/perl/win32-bin/ppms/ > > Good luck with the course! > > Regards, > Chris > > > > Jennifer Hsu wrote: > >> Hi, Chris: >> My name is Jennifer, I am a student in BioPerl class at Foothill >> College, Los Altos, CA. Our class >> is trying to install bioperl. The entire class is encountering the >> problem of not being able to >> find IO-String. >> So I searched for IO-String and got 4 choices: >> ppm> search IO-String >> Searching in Active Repositories >> 1. IO-String <1.02> Emulate IO::File interface for in-core strings >> 2. IO-String <1.03> Emulate file interface for in-core strings >> 3. IO-String <1.04> Emulate file interface for in-core strings >> 4. IO-stringy <2.108> stringy - I/O on in-core objects like strings >> and ar~ >> - I tried to install 3, 2, 1 , but each time I got: >> PPD for 'IO-String.ppd' could not be found. >> It found that IO-stringy (item 4) is already installed in my system, >> but BioPerl is looking for >> IO-String. >> - I have ActivePerl 5.6.1.635. Is this build incompatible with Bioperl? >> - Please advise me, where can I find IO-String? What can I do to >> build this PPD. All the students >> in my class are stuck, Help! 
>> Thanks >> Jennifer >> > > > From dfclark at neo.tamu.edu Fri Feb 13 20:39:22 2004 From: dfclark at neo.tamu.edu (David Clark) Date: Fri Feb 13 20:45:16 2004 Subject: [Bioperl-l] Fasta Genome Splice In-Reply-To: References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca> <9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu> Message-ID: <9D52E948-5E8E-11D8-859C-0030657E637C@neo.tamu.edu> Thanks Jason, this is exactly what I needed. I just took peek in Seq.pm to see how the sequence objects are implemented, used your example, and I'm ready to go. David On Feb 12, 2004, at 2:46 PM, Jason Stajich wrote: > On Thu, 12 Feb 2004, David Clark wrote: > >> Good point. What I need is two fasta files: one with the ofr regions >> masked, and one with the non-ofr regions masked. > > This is a little bit of work, but pretty easy since you can fit whole > yeast chromosomes into memory. I do it by figuring out what I want to > mask and then do: > substr($chromseq,$start,$len,'N'x$len) > > So you can just write a simple parser for the chromsomal_features.tab > while( ){ > my ($feature,$gene,$sgdid, ... etc ) = split(/\t/,$_); > # do the substr replace here > } > >> There was another thing I wanted to do that I didn't mention before: >> how >> can I generate the reverse compliment of a whole genome file? > > That's easy with emboss > % revseq FILE.fwd FILE.rev > > With bioperl -- see the Sequence HOWTO in the howto section of the > bioperl > website. you want to use the revcom method in bioperl Bio::PrimarySeq > objects. 
> > # change fasta to whatever format you have/want the sequences in > my $in = Bio::SeqIO->new(-file => 'filename', -format => 'fasta'); > my $out = Bio::SeqIO->new(-file => '>filename.rev', -format => > 'fasta'); > while( my $s = $in->next_seq ) { > $out->write_seq($s->revcom); > } From lstein at cshl.edu Sun Feb 15 13:13:53 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Sun Feb 15 13:19:58 2004 Subject: [Bioperl-l] Creating imagemaps from Bio::Graphics (Was Re: Bio::Graphics questions) In-Reply-To: <200402131128.00665.lstein@cshl.edu> References: <89AA811FD79DC94788093B23DA79E71FD9A83F@edunivmail02.ad.umassmed.edu> <200402131128.00665.lstein@cshl.edu> Message-ID: <200402152013.53586.lstein@cshl.edu> Hi Nathan, I've just committed a chunk of code to Bio::Graphics::Panel that should drastically simplify the task of generating a clickable imagemap from within a CGI script. It should be pretty obvious what to do from the POD documentation. Best, Lincoln On Friday 13 February 2004 11:28 am, Lincoln Stein wrote: > Oh gee, I just answered that question on the bioperl mailing list > and now I can't find it. Maybe you can find it in the archive? > > In any case, since this is such a general and useful question, and > the answer requires about a page of typing, I'm going to > incorporate the answer into the next revision of the tutorial. I > think it needs some example code to go along with it.
> > Regards, > > Lincoln > > On Thursday 12 February 2004 10:41 pm, Agrin, Nathan wrote: > > Hey Lincoln, > > > > I talked to you a while back about some Bio::Graphics questions > > and was hoping you could help me with one (or two) more. I need > > to generate HTML image maps using a dynamically created > > Bio::Graphics image based off of blast reports. I am still > > somewhat new to CGI, so bare with me. Basically, I think my main > > problems are first off, generating the image, and having the > > browser display it. I know you need to put the correct image/png > > header in the script, but where will the image reside once it's > > created, and how can direct the browser to that image? > > > > Also, I tried looking at the Generic Genome Browser for info on > > creating HTML image maps, and could find none. Can you point me > > in the right direction, or send me an example script? > > > > Thanks in advance, > > Nate Agrin > > > > Nathan Agrin > > Research Associate > > UMass Medical Center > > 55 Lake Ave. N. > > Worcester MA, 01655 > > (508)-856-6018 > > nathan.agrin@umassmed.edu -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From darndt at treasurehouseimports.com Sun Feb 15 14:24:55 2004 From: darndt at treasurehouseimports.com (David Arndt) Date: Sun Feb 15 14:33:05 2004 Subject: [Bioperl-l] Does Bio::Tools::Glimmer only parse GlimmerM? Message-ID: Does anyone know whether Bio::Tools::Glimmer will parse results from the regular Glimmer (not GlimmerM) correctly? Thanks From pvh at egenetics.com Sun Feb 15 15:29:04 2004 From: pvh at egenetics.com (Peter van Heusden) Date: Sun Feb 15 15:34:58 2004 Subject: [Bioperl-l] Validating Bioperl Message-ID: <402FD690.2030407@egenetics.com> Hi BioPerl people I have been hired by Electric Genetics to spend no less than 50% of my time "validating" Bioperl. 
What this means is that I'm slowly going through BioPerl, reviewing the
code and documentation, and trying to ensure three related things:

1) The documentation clearly specifies input, output and exception
   conditions for the code.
2) The code complies with the documentation and behaves as expected.
3) The test suite exhaustively tests the code to ensure that 2 is true.

The goal of this 'validation' is to be able to offer some kind of
assurance to our customers (who include some big names in the
pharmaceuticals field) that Bioperl is robust enough to be included
without worry in their development process. Their fear surrounding open
source tools is based on past experiences, particularly upgrading across
various versions of operating systems and tools, and the slow tightening
of FDA requirements for software included in any clinical development
process.

The tangible output of the validation work will be:

- improved code that is submitted back to the Bioperl CVS
- new features, as requested by our pharma clients, that are implemented
  by EG and submitted to the Bioperl CVS
- professional-grade documentation, which is provided to EG's customers
  as part of the Bioperl validation and support product on offer

Finally, to give a bit of background: Electric Genetics is a
bioinformatics software company based in South Africa and the USA. The
name should be familiar to a number of BioPerl hackers - we've been
around for some years and sponsored the first 'BioHackathon' (in Cape
Town in 2002). We've been open source enthusiasts for years, and with
this product can finally bridge the gap between our commercial reality
and our open source aspirations.

Looking forward to lots of BioPerl hacking,
Peter

From wes.barris at csiro.au Mon Feb 16 01:33:02 2004
From: wes.barris at csiro.au (Wes Barris)
Date: Mon Feb 16 01:39:02 2004
Subject: [Bioperl-l] Bioperl and ACE files
Message-ID: <4030641E.2000403@csiro.au>

Hi,

I have an ACE file that I am trying to process with bioperl.
A portion of the ACE file looks like this: AF CB429506 U 2 AF CB428704 U 6 AF CB430643 U 1 AF CB431187 U 0 AF CB430639 U -7 AF CB430480 C 24 AF CB430055 U 10 Notice the line in the middle that shows a starting position of '0' (zero)? When bioperl tries to process this sequence, an error is thrown. I have found the port of the bioperl code that throws the error: Bio/LocatableSeq.pm: sub get_nse{ my ($self,$char1,$char2) = @_; $char1 ||= "/"; $char2 ||= "-"; $self->throw("Attribute id not set") unless $self->id(); $self->throw("Attribute start not set") unless $self->start();<----- $self->throw("Attribute end not set") unless $self->end(); return $self->id() . $char1 . $self->start . $char2 . $self->end ; } Notice how "$self->start()" is tested. When it encounters a sequence whose start is set to zero, an error is thrown. I don't know much about the ACE file format. Do I have a questionable ACE file or is this test incomplete? -- Wes Barris E-Mail: Wes.Barris@csiro.au From lstein at cshl.edu Mon Feb 16 04:05:43 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Mon Feb 16 04:11:41 2004 Subject: [Bioperl-l] Re: Bundle-Bioperl installation under Activestate In-Reply-To: <402D2B4F.6000104@sonsorol.org> References: <20040213190801.31707.qmail@web10003.mail.yahoo.com> <402D2B4F.6000104@sonsorol.org> Message-ID: <200402161105.43148.lstein@cshl.edu> Hi Jen, Don't forget that you can also use CPAN for installing Perl modules on Windows: C:\ perl -MCPAN -e shell cpan> install IO::String This will work as long as you are installing a "pure perl" module that doesn't need compilation. Most of the CPAN modules fall into this category. Lincoln On Friday 13 February 2004 09:53 pm, Chris Dagdigian wrote: > Hi Jennifer, > > I've never used perl on Windows - only under various Unix flavors > and my personal systems are mostly Mac OS X or Linux these days. > > I'm cc'ing this reply to the bioperl discussion list where I know > there are active windows users of bioperl. 
We also have some .ppd > files on our download site http://bioperl.org/DIST/ but since I've > never used the ActiveState stuff I have very little clue about > them! > > A quick google search for io::string + ppd turnd up this link which > may be helpful: > > http://www.apache.org/dist/perl/win32-bin/ppms/IO-String.ppd > > There seem to be several ppm archives on the net, IO-String is also > found here apparently: > http://www.online-mirror.org/apache/perl/win32-bin/ppms/ > > Good luck with the course! > > Regards, > Chris > > Jennifer Hsu wrote: > > Hi, Chris: > > My name is Jennifer, I am a student in BioPerl class at Foothill > > College, Los Altos, CA. Our class is trying to install bioperl. > > The entire class is encountering the problem of not being able to > > find IO-String. > > > > So I searched for IO-String and got 4 choices: > > ppm> search IO-String > > Searching in Active Repositories > > 1. IO-String <1.02> Emulate IO::File interface for in-core > > strings 2. IO-String <1.03> Emulate file interface for in-core > > strings 3. IO-String <1.04> Emulate file interface for in-core > > strings 4. IO-stringy <2.108> stringy - I/O on in-core objects > > like strings and ar~ > > > > - I tried to install 3, 2, 1 , but each time I got: > > PPD for 'IO-String.ppd' could not be found. > > It found that IO-stringy (item 4) is already installed in my > > system, but BioPerl is looking for IO-String. > > - I have ActivePerl 5.6.1.635. Is this build incompatible with > > Bioperl? - Please advise me, where can I find IO-String? What can > > I do to build this PPD. All the students in my class are stuck, > > Help! > > Thanks > > Jennifer -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From jason at cgt.duhs.duke.edu Mon Feb 16 07:46:41 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Feb 16 07:52:35 2004 Subject: [Bioperl-l] Bioperl and ACE files In-Reply-To: <4030641E.2000403@csiro.au> References: <4030641E.2000403@csiro.au> Message-ID: As always, more code and information as to how you got here makes it easier for someone to answer. Not really sure how you are getting to the point where you have created Bio::LocatableSeq objects - presumably you are trying to do an assembly so I'll guess you got there from Bio::Assembly::IO. You may need to get help from Robson about what the format is supposed to support. A start of 0 is not really proper in Bioperl - sequences/features start at 1 in our system, so the assembly code needs to adjust for that. presumably those numbers are offsets not actual start positions so the parsing code may need some looking at. -jason On Mon, 16 Feb 2004, Wes Barris wrote: > Hi, > > I have an ACE file that I am trying to process with bioperl. A portion > of the ACE file looks like this: > > AF CB429506 U 2 > AF CB428704 U 6 > AF CB430643 U 1 > AF CB431187 U 0 > AF CB430639 U -7 > AF CB430480 C 24 > AF CB430055 U 10 > > Notice the line in the middle that shows a starting position of '0' > (zero)? When bioperl tries to process this sequence, an error is > thrown. I have found the port of the bioperl code that throws the > error: > Bio/LocatableSeq.pm: > sub get_nse{ > my ($self,$char1,$char2) = @_; > > $char1 ||= "/"; > $char2 ||= "-"; > > $self->throw("Attribute id not set") unless $self->id(); > $self->throw("Attribute start not set") unless $self->start();<----- > $self->throw("Attribute end not set") unless $self->end(); > > return $self->id() . $char1 . $self->start . $char2 . $self->end ; > > } > > Notice how "$self->start()" is tested. When it encounters a sequence > whose start is set to zero, an error is thrown. 
>
> I don't know much about the ACE file format.  Do I have a questionable
> ACE file or is this test incomplete?
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From billthebrute at yahoo.fr Mon Feb 16 10:42:38 2004
From: billthebrute at yahoo.fr (william ritchie)
Date: Mon Feb 16 10:48:26 2004
Subject: [Bioperl-l] get sequence failure
Message-ID: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com>

Hi,

I m trying to retrieve a sequence with the following
code:

use Bio::SearchIO;
use Bio::Perl;
use Bio::SeqIO;

my $seq_object = get_sequence("nr","NT_039208.2");

and I m getting this error:

MSG: id does not exist
STACK Bio::DB::WebDBSeqI::get_Seq_by_id
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:155
STACK Bio::Perl::get_sequence
/usr/lib/perl5/site_perl/5.8.0/Bio/Perl.pm:513
STACK toplevel getseq.pl:9

this ID does exist, using the ncbi interface I can
retrieve it!
Help!!!!Please!!

From jason at cgt.duhs.duke.edu Mon Feb 16 13:09:33 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Feb 16 13:15:23 2004
Subject: [Bioperl-l] get sequence failure
In-Reply-To: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com>
References: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com>
Message-ID:

You need to use 'refseq' as the db type for refseq sequences ("NT", "NM").
'nr' is not a recognized type of seq db in the first place anyway...

I'm noticing from the documentation and code that the types aren't really
documented very well for get_sequence -- need someone to fix this please.

-jason

On Mon, 16 Feb 2004, william ritchie wrote:

> Hi,
>
> I m trying to retrieve a sequence with the following
> code:
>
> use Bio::SearchIO;
> use Bio::Perl;
> use Bio::SeqIO;
>
> my $seq_object = get_sequence("nr","NT_039208.2");
>
> and I m getting this error:
>
> MSG: id does not exist
> STACK Bio::DB::WebDBSeqI::get_Seq_by_id
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:155
> STACK Bio::Perl::get_sequence
> /usr/lib/perl5/site_perl/5.8.0/Bio/Perl.pm:513
> STACK toplevel getseq.pl:9
>
> this ID does exist, using the ncbi interface I can
> retrieve it!
> Help!!!!Please!!
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From tex at biosysadmin.com Mon Feb 16 17:49:04 2004
From: tex at biosysadmin.com (Tex Thompson)
Date: Mon Feb 16 17:44:31 2004
Subject: [Bioperl-l] Bug in GCG SeqIO Formatting?
Message-ID:

Hello Mailing List,

I have a user complaining that the following code isn't working on his
GCG-formatted sequence files:

#!/usr/bin/perl
use strict;
use Bio::SeqIO;

my $io  = Bio::SeqIO->new( -file => "af317472.gbpln3", -format => "gcg");
my $out = Bio::SeqIO->new( -fh => \*STDOUT, -format => "fasta" );

while ( my $seq = $io->next_seq ) {
    $out->write_seq( $seq );
}

Here's an example sequence file:

!!NA_SEQUENCE 1.0
LOCUS       AF317472     2679 bp    DNA     linear   PLN 07-DEC-2000
DEFINITION  Candida albicans cAMP-dependent protein kinase regulatory subunit
            (PKA-R) gene, complete cds.
ACCESSION   AF317472
VERSION     AF317472.1  GI:11596392
KEYWORDS    .
SOURCE      Candida albicans
  ORGANISM  Candida albicans
            Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
            Saccharomycetales; mitosporic Saccharomycetales; Candida.
REFERENCE   1  (bases 1 to 2679)
  AUTHORS   Giasson,L. and Parrot,M.
  TITLE     Sequence of the Candida albicans cAMP-dependent protein kinase
            regulatory subunit
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2679)
  AUTHORS   Giasson,L. and Parrot,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-OCT-2000) School of Dentistry, Laval University,
            GREB, Ste-Foy, Quebec G1K 7P4, Canada
FEATURES             Location/Qualifiers
     source          1. .2679
                     /organism="Candida albicans"
                     /mol_type="genomic DNA"
                     /strain="CAI4"
                     /db_xref="taxon:5476"
     gene            <977. .>2356
                     /gene="PKA-R"
     mRNA            <977. .>2356
                     /gene="PKA-R"
                     /product="cAMP-dependent protein kinase regulatory
                     subunit"
     CDS             977.
.2356 /gene="PKA-R" /codon_start=1 /transl_table=12 /product="cAMP-dependent protein kinase regulatory subunit" /protein_id="AAG38599.1" /db_xref="GI:11596393" /translation="MSNPQQQFISDELSQLQKEIISKNPQDVLQFCANYFNTKLQAQR SELWSQQAKAEAAGIDLFPSVDHVNVNSSGVSIVNDRQPSFKSPFGVNDPHSNHDEDP HAKDTKTDTAAAAVGGGIFKSNFDVKKSASNPPTKEVDPDDPSKPSSSSQPNQQSASA SSKTPSSKIPVAFNANRRTSVSAEALNPAKLKLDSWKPPVNNLSITEEETLANNLKNN FLFKQLDANSKKTVIAALQQKSFAKDTVIIQQGDEGDFFYIIETGTVDFYVNDAKVSS SSEGSSFGELALMYNSPRAATAVAATDVVCWALDRLTFRRILLEGTFNKRLMYEDFLK DIEVLKSLSDHARSKLADALSTEMYHKGDKIVTEGEQGENFYLIESGNCQVYNEKLGN IKQLTKGDYFGELALIKDLPRQATVEALDNVIVATLGKSGFQRLLGPVVEVLKEQDPT KSQDPTAGH" ORIGIN AF317472 Length: 2679 February 16, 2004 17:02 Type: N Check: 9369 .. 1 GAATTCAAAA AATCAAAAAA ATCAAAAAAA AACCGTGGAA GGTAAGTTGT 51 ATATTTATAA ATCAACGTGA ATAATTTTCA ACACTGTGTC AACATCTGTG 101 AAAAAAACCT GTGTGTACTG CATATAGGAC CTCACCTATT ACGTAGAATA 151 TACTAGAAAT AGTTACAACC ATAAAAAGAT TAATTGTGCT TACGTGGCAA 201 CTTTGAGATT TTTCTTTTTT CTGTTTCTTT CTTTCTTTTT TTGGCTTAAA 251 CAACAAATGT CGCAAATTAT ACAAACGACA TTTGCTGCCC ATGTCATTTT 301 GTCGTTATCA CGTGAAGTGT CGCAGATTTA TGTATTCTCA CTTCATTTCT 351 ATGGTCATCA ATTGTTCATT CATTCTCTAT CTTCAAAAAT CTGTGATTTG 401 ATGATTTTGA TTAAAAGAAA GCAAAGAGAA TACTGAAAAA AAGCAAAGAG 451 AATATAGAAA AGAAACAATA AAAGAATAGT TTCTAAGTTA CTTTGGAGTC 501 TGCTATTACC ATGTATCTAT GTGATTGCCC TATCAAATTG GACAATACGG 551 GTTTTTGTTT AGTCACGATA ATCACAAACT TCCCCCAGCA ATGACATACG 601 TAGCAAGTAA TATTTATATC TCTTCTATTT TTTTGATCTT ACATAATCTG 651 TCGTGTTTTT TTAAGTTGTT GTTATGAAGA AGTAATTTCA TAATGATCAA 701 GTGTGTAACT GAAATTTCAT CGCAATTTTA AACAAACAAG CTAATAATTA 751 TTATTATTAA TAGTTAATTT GCTAAGTTGA GTAAAATTTG CTTTTCTTGA 801 GAAAAAGGAG AAATTACTTT GGGAGTGAGT TTGAAGAGAG AAACTAAAGT 851 AAGTAAATGA GTGAGAGGGA GAGACAGAGA GCGAGAGGGG GAGTAAAAAA 901 AAAAGTTGCC CACAAACAAA TTGTGATACC GGTCTTTTAG CATATATCTT 951 CTACTCTTCA ATCAACATCT TTACCAATGT CTAATCCTCA ACAACAATTC 1001 ATATCTGATG AATTGTCGCA GTTACAGAAA GAAATAATTT CCAAAAACCC 1051 GCAAGATGTC TTACAGTTTT GCGCCAACTA 
TTTCAACACC AAGTTACAAG 1101 CTCAAAGAAG TGAGTTATGG TCGCAACAAG CTAAAGCAGA AGCCGCAGGC 1151 ATCGACTTAT TCCCATCTGT TGATCATGTG AATGTTAATT CTAGTGGTGT 1201 GAGCATTGTG AATGATAGAC AACCAAGTTT TAAATCACCT TTTGGTGTTA 1251 ATGATCCACA TCTGAATCAC GACGAAGATC CCCATGCCAA AGATACCAAA 1301 ACAGATACTG CTGCTGCTGC TGTTGGTGGG GGTATTTTCA AATCAAATTT 1351 TGATGTTAAA AAGAGTGCTT CTAATCCTCC AACCAAGGAA GTAGATCCAG 1401 ATGACCCATC AAAACCATCG TCATCGAGCC AACCAAATCA ACAATCAGCA 1451 TCAGCATCAT CAAAAACGCC ATCATCAAAG ATCCCAGTTG CTTTCAACGC 1501 TAATAGAAGA ACATCTGTAT CTGCTGAAGC CTTGAATCCA GCAAAATTGA 1551 AATTAGATAG TTGGAAACCT CCAGTTAATA ATTTGAGCAT TACCGAAGAA 1601 GAAACATTAG CCAACAATTT AAAGAACAAT TTCCTTTTCA AACAATTGGA 1651 CGCAAACTCT AAGAAAACTG TGATTGCTGC TTTACAACAA AAATCATTTG 1701 CTAAAGATAC AGTAATTATC CAACAAGGTG ATGAAGGGGA CTTTTTTTAC 1751 ATTATTGAAA CTGGTACAGT TGATTTCTAT GTTAATGATG CTAAAGTAAG 1801 TTCCAGTAGC GAAGGGTCAT CTTTTGGGGA ATTGGCTTTG ATGTATAATT 1851 CACCAAGAGC TGCTACGGCA GTTGCTGCCA CCGATGTTGT CTGTTGGGCA 1901 TTGGACCGTT TGACATTCCG TCGAATTCTT TTGGAAGGTA CTTTTAACAA 1951 GAGATTGATG TACGAGGATT TCTTAAAAGA TATTGAGGTT TTGAAATCTC 2001 TTTCGGATCA TGCACGTTCA AAATTGGCAG ATGCATTGAG CACAGAAATG 2051 TATCACAAGG GTGATAAAAT AGTCACTGAA GGTGAACAAG GAGAGAACTT 2101 TTATTTAATA GAAAGTGGAA ACTGTCAAGT TTACAATGAA AAGTTGGGCA 2151 ATATCAAACA ATTAACAAAA GGTGATTATT TTGGTGAGCT TGCATTAATA 2201 AAAGACTTAC CAAGACAAGC TACTGTGGAA GCATTGGATA ATGTAATCGT 2251 TGCCACATTA GGTAAATCCG GGTTCCAAAG ATTATTGGGT CCTGTTGTGG 2301 AGGTATTGAA AGAACAAGAC CCTACAAAGA GTCAAGACCC AACTGCTGGT 2351 CATTAAGTGT ACAATAAGTA GTTGTTTATT ATCTTATATT GTTTTATGTT 2401 AGTATATTCT ATCTTTTTTT TTTTGGCTTA CTCACCTTCT GGTGTTTTCG 2451 TTGCGATTTT GATAATGGAT GGTTGGTGCA AAAGTTCAAC TACATTTCTT 2501 GTTGTCAGGT ATATACGAGA TGGCAGCATG AACGAGCTCA CCATGGGTTG 2551 AACATTATTG AAGTTATCCG GCCGTGCCTT TTGCGAAACA TGGTAACTAA 2601 TATATTGCAA ACTTGGCTTC TACAGAAAAT ATACAATCTA ATACCTTGAG 2651 GAATTTCCTC TATATATAAT AGAGAATTC I'm not a GCG expert, but is this a correctly formatted GCG file in the first 
place? If not, is this an error in the SeqIO parser? I've found this behavior to be the same on Solaris 8 and on Linux, both running BioPerl 1.4 and Perl 5.8.1. Thanks a bunch, Tex Thompson RIT Bioinformatics From wes.barris at csiro.au Mon Feb 16 18:19:02 2004 From: wes.barris at csiro.au (Wes Barris) Date: Mon Feb 16 18:28:10 2004 Subject: [Bioperl-l] ace.pm Message-ID: <40314FE6.60906@csiro.au> Hi, ACE files generated by an application called tgicl have "CO" lines of the form: CO CL15Contig2 794 4 0 U This line is not parsed properly by the ace.pm bioperl module. Notice this line from Bio/Assembly/IO/ace.pm . (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found! Bioperl expects the second "word" in the line to be "Contig\d+" where the number is used as the "contigID". Is there a reason why "contigID" must be a number? Why can't it be the whole second "word" of the "CO" line? -- Wes Barris E-Mail: Wes.Barris@csiro.au From jason at cgt.duhs.duke.edu Mon Feb 16 21:14:28 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Feb 16 21:20:20 2004 Subject: [Bioperl-l] ace.pm In-Reply-To: <40314FE6.60906@csiro.au> References: <40314FE6.60906@csiro.au> Message-ID: People write code and modules to support the work they are doing, sometimes for a specific data set - so I suspect Robson wrote this to support phrap ace format which has a convention of them being ContigXX. You are welcome to make changes to code on your local system to get it working and then post the diffs so they can be incorporated back in. Why not try changing the code as you have noticed and seeing if it works. It is a collaborative project and these modules are newish, so give a try fixing things and then getting feedback on your fixes. -jason On Tue, 17 Feb 2004, Wes Barris wrote: > Hi, > > ACE files generated by an application called tgicl have "CO" > lines of the form: > > CO CL15Contig2 794 4 0 U > > This line is not parsed properly by the ace.pm bioperl module. 
> Notice this line from Bio/Assembly/IO/ace.pm . > > (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New > contig found! > > Bioperl expects the second "word" in the line to be "Contig\d+" where > the number is used as the "contigID". Is there a reason why > "contigID" must be a number? Why can't it be the whole second > "word" of the "CO" line? > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From john.herbert at clinical-pharmacology.oxford.ac.uk Thu Feb 12 06:50:45 2004 From: john.herbert at clinical-pharmacology.oxford.ac.uk (john herbert) Date: Mon Feb 16 21:22:34 2004 Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM Message-ID: Hello BioPerl. I was thinking of using BioPerl to calculate the Tm of my primers. I looked from the documentation and found the method for getting a primer's Tm. my $tm = $primer->Tm; In the notes of this method, the author is confused by the fact that a BioPerl calculated Tm never matches that of Primer3's prediction. This is because Primer3 uses a different method to calculate a Primers TM than the method used in the BioPerl. BioPerl $primer->Tm = Calculated using: Tm = 81.5 + 16.6(log10([Na+])) + .41*(%GC) - 600/length (Sambrook, Fritsch and Maniatis, Molecular Cloning, p 11.46 (1989, CSHL Press). Primer3 primer TM = Primer3 uses the oligo melting temperature formula given in Rychlik, Spencer and Rhoads, Nucleic Acids Research, vol 18, num 12, pp 6409-6412 and Breslauer, Frank, Bloeker and Marky, Proc. Natl. Acad. Sci. USA, vol 83, pp 3746-3750. This method uses thermodynamics to calc TM. Primer3 does use the Sambrook method to predict the TM of the PCR products but not the Primer TM. Would it be possible to update the BioPerl to match how Primer3 calculates the Primer TM? Kind regards, John Herbert Cancer Research UK. 
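For illustration, the salt-adjusted formula quoted in the message above can be written as a few lines of Perl. This is only a minimal sketch under stated assumptions: the helper name sambrook_tm and the 50 mM default for [Na+] are made up for this example, it is not the code in Bio::SeqFeature::Primer, and it does not attempt the nearest-neighbor thermodynamic method that Primer3 is described as using.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch of the formula quoted above (Sambrook, Fritsch and Maniatis):
#   Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/length
# sambrook_tm() is a hypothetical helper, not a BioPerl method.
sub sambrook_tm {
    my ($seq, $na_molar) = @_;
    $na_molar = 0.05 unless defined $na_molar;   # assumed default: 50 mM Na+
    my $len = length $seq;
    die "empty sequence\n" unless $len;

    my $gc       = ($seq =~ tr/GCgc//);          # tr/// here only counts G and C
    my $pct_gc   = 100 * $gc / $len;             # GC content as a percentage
    my $log10_na = log($na_molar) / log(10);     # Perl's log() is the natural log

    return 81.5 + 16.6 * $log10_na + 0.41 * $pct_gc - 600 / $len;
}

# A 20-mer with 50% GC at the assumed 50 mM Na+
printf "%.1f\n", sambrook_tm('ATGCATGCATGCATGCATGC');
```

For the 20-mer above this works out to roughly 50.4 degrees C. The value depends strongly on the salt concentration passed in, which is one more reason it will not line up with a thermodynamic estimate.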
From walsh at cenix-bioscience.com Mon Feb 16 11:16:14 2004 From: walsh at cenix-bioscience.com (Andrew Walsh) Date: Mon Feb 16 21:26:43 2004 Subject: [Bioperl-l] get sequence failure In-Reply-To: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com> References: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com> Message-ID: <4030ECCE.8090608@cenix-bioscience.com> Hi, I think the problem could be that you are trying to retrieve a contig (NT number). I remember Bio::DB::GenBank used to have problems with those, but I'm not sure about now. And from reading the Bio::Perl::get_sequence POD, I see: get_sequence Title : get_sequence Usage : $seq_object = get_sequence('swiss',"ROA1_HUMAN"); ... Args : database type - one of swiss, embl, genbank or refseq identifier or accession number Are you sure 'nr' is a valid database type? Cheers, Andrew william ritchie wrote: > Hi, > > I m trying to retrieve a sequence with the following > code: > > use Bio::SearchIO; > use Bio::Perl; > use Bio::SeqIO; > > my $seq_object = get_sequence("nr","NT_039208.2"); > > and I m getting this error: > > MSG: id does not exist > STACK Bio::DB::WebDBSeqI::get_Seq_by_id > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:155 > STACK Bio::Perl::get_sequence > /usr/lib/perl5/site_perl/5.8.0/Bio/Perl.pm:513 > STACK toplevel getseq.pl:9 > > this ID does exist, using the ncbi interface I can > retrieve it! > Help!!!!Please!! > > > > > > > > Yahoo! Mail : votre e-mail personnel et gratuit qui vous suit partout ! > Cr?ez votre Yahoo! Mail sur http://fr.benefits.yahoo.com/ > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------------ Andrew Walsh, M.Sc. Bioinformatics Software Engineer IT Unit Cenix BioScience GmbH Pfotenhauerstr. 108 01307 Dresden, Germany Tel. 
+49(351)210-2699 Fax +49(351)210-1309 public key: http://www.cenix-bioscience.com/public_keys/walsh.gpg ------------------------------------------------------------------ From wes.barris at csiro.au Mon Feb 16 17:42:54 2004 From: wes.barris at csiro.au (Wes Barris) Date: Mon Feb 16 21:26:44 2004 Subject: [Bioperl-l] Bioperl and ACE files In-Reply-To: References: Message-ID: <4031476E.9020900@csiro.au> Jason Stajich wrote: > As always, more code and information as to how you got here makes it > easier for someone to answer. I have attached the perl script that I am using. The sample ACE file is available here: http://www.livestockgenomics.csiro.au/junk.ace You might run this script like this: acetest.pl junk.ace outdir > > Not really sure how you are getting to the point where you have created > Bio::LocatableSeq objects - presumably you are trying to do an assembly > so I'll guess you got there from Bio::Assembly::IO. > > You may need to get help from Robson about what the format is supposed to > support. A start of 0 is not really proper in Bioperl - > sequences/features start at 1 in our system, so the assembly code needs to > adjust for that. presumably those numbers are offsets not actual start > positions so the parsing code may need some looking at. > > -jason > > > On Mon, 16 Feb 2004, Wes Barris wrote: > > > Hi, > > > > I have an ACE file that I am trying to process with bioperl. A portion > > of the ACE file looks like this: > > > > AF CB429506 U 2 > > AF CB428704 U 6 > > AF CB430643 U 1 > > AF CB431187 U 0 > > AF CB430639 U -7 > > AF CB430480 C 24 > > AF CB430055 U 10 > > > > Notice the line in the middle that shows a starting position of '0' > > (zero)? When bioperl tries to process this sequence, an error is > > thrown. 
I have found the port of the bioperl code that throws the
> > error:
> > Bio/LocatableSeq.pm:
> > sub get_nse{
> >    my ($self,$char1,$char2) = @_;
> >
> >    $char1 ||= "/";
> >    $char2 ||= "-";
> >
> >    $self->throw("Attribute id not set") unless $self->id();
> >    $self->throw("Attribute start not set") unless $self->start();<-----
> >    $self->throw("Attribute end not set") unless $self->end();
> >
> >    return $self->id() . $char1 . $self->start . $char2 . $self->end ;
> >
> > }
> >
> > Notice how "$self->start()" is tested.  When it encounters a sequence
> > whose start is set to zero, an error is thrown.
> >
> > I don't know much about the ACE file format.  Do I have a questionable
> > ACE file or is this test incomplete?
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
>

--
Wes Barris
E-Mail: Wes.Barris@csiro.au
-------------- next part --------------
#!/usr/local/bin/perl -w
#
#
use strict;
use Bio::Assembly::IO;
use Bio::AlignIO;
use Bio::SeqIO;
#
my $usage = "Usage: $0 \n";
my $infile = shift or die $usage;
my $outdir = shift or die $usage;
my $prefix = 'cn';
my $ext = 'msf';

mkdir $outdir, 0755 if (! -d $outdir);
my $io = new Bio::Assembly::IO(-file=>$infile, -format=>'ace');
my $assembly = $io->next_assembly;              # Bio::Assembly::ScaffoldI
foreach my $contig ($assembly->all_contigs()) { # Bio::Assembly::Contig
   my $contigName = $prefix.($contig->id);
#
# Write the consensus to a file.
#
   my $consensusSeq = new Bio::Seq(
      -seq=>$contig->get_consensus_sequence->seq,
      -id=>$contigName);
   my $seqout = new Bio::SeqIO(-file=>">$outdir/$contigName.fa",
      -format=>'fasta');
   $seqout->write_seq($consensusSeq);
#
# Make the consensus the first sequence of the simple align object.
#
   my $aln = new Bio::SimpleAlign();
   $aln->id('alignment.msf');
   $contig->get_consensus_sequence->id($contigName);
   $aln->add_seq($contig->get_consensus_sequence);
#
# Loop through each sequence in the contig adding it to the new alignment.
#
   foreach my $seq ($contig->each_seq) {
      my $id;
      if ($seq->display_id =~ /\|/) {
         my @junk = split(/[\|\.]/, $seq->display_id);
         $id = $junk[3];
         }
      else {
         $id = $seq->display_id;
         }
      my $lseq = new Bio::LocatableSeq(
         -seq=>$seq->seq,
         -id=>$id,
         -start=>$contig->get_seq_coord($seq)->start,
         -end=>$contig->get_seq_coord($seq)->end,
         );
      &alignSeq($lseq,$contig->get_consensus_sequence->length);
      $aln->add_seq($lseq);
      }
   $aln->set_displayname_flat;
   my $outstream = new Bio::AlignIO(-format=>'msf',
      -file=>">$outdir/$contigName.$ext");
   $outstream->write_aln($aln);
   undef $outstream;
   }
exit;

sub alignSeq {
   my ($lseq, $cnlength) = @_;
#
# Clip any sequence that begins before the consensus.
#
   if ($lseq->start <= 0) {
      my $offset = -$lseq->start + 2;
      $lseq->seq($lseq->subseq($offset,$lseq->length));
      print($lseq->display_id," was clipped at the beginning by $offset\n");
      }
#
# Pad each sequence so it aligns with the consensus.
#
   my $before = $lseq->start - 1;
   my $after = $cnlength - $lseq->end;
   my $alignedSequence = '-' x $before . $lseq->seq . '-' x $after;
#
# Trim any sequence that extends beyond the consensus.
#
   if (length($alignedSequence) > $cnlength) {
      $alignedSequence = substr($alignedSequence, 0 ,$cnlength);
      print($lseq->display_id," was clipped at the end\n");
      }
   $lseq->seq($alignedSequence);
   return;
   }

From lhaifeng at dso.org.sg Mon Feb 16 22:14:19 2004
From: lhaifeng at dso.org.sg (Liu Haifeng)
Date: Mon Feb 16 22:31:03 2004
Subject: [Bioperl-l] align and profile_align
Message-ID: <001a01c3f504$225cd790$706712ac@GENETHON>

Hi,

Anybody can help me understand the difference between the methods
"align" and "profile_align" when using
Bio::Tools::Run::Alignment::Clustalw to align multiple protein
sequences? If I have 5 protein sequences, would the 2 options below be
same?

1. align the 5 sequences together
2. align the first 4 sequence, then profile_align the alignment obtained
   with the last sequence together

I have tested the two options, the consensus_strings obtained are different.
So can anybody tell me how profile_align can be useful? It seems that profile_align may save some computation. Thanks! Haifeng Liu From wes.barris at csiro.au Mon Feb 16 22:35:49 2004 From: wes.barris at csiro.au (Wes Barris) Date: Mon Feb 16 22:41:46 2004 Subject: [Bioperl-l] ace.pm In-Reply-To: References: Message-ID: <40318C15.8020903@csiro.au> Jason Stajich wrote: > > People write code and modules to support the work they are doing, > sometimes for a specific data set - so I suspect Robson wrote this to > support phrap ace format which has a convention of them being ContigXX. > > You are welcome to make changes to code on your local system to get it > working and then post the diffs so they can be incorporated back in. Why > not try changing the code as you have noticed and seeing if it works. It > is a collaborative project and these modules are newish, so give a try > fixing things and then getting feedback on your fixes. I have modified one line in Bio/Assembly/IO/ace.pm as shown below: # Loading contig sequence (COntig sequence field) # (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found! (/^CO (\w+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found! The change will cause the contigID to be whatever the second field of this line is (CO CL15Contig1 794 4 0 U). In this case, it would be set to "CL15Contig1". > > -jason > > On Tue, 17 Feb 2004, Wes Barris wrote: > > > Hi, > > > > ACE files generated by an application called tgicl have "CO" > > lines of the form: > > > > CO CL15Contig2 794 4 0 U > > > > This line is not parsed properly by the ace.pm bioperl module. > > Notice this line from Bio/Assembly/IO/ace.pm . > > > > (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New > > contig found! > > > > Bioperl expects the second "word" in the line to be "Contig\d+" where > > the number is used as the "contigID". Is there a reason why > > "contigID" must be a number? Why can't it be the whole second > > "word" of the "CO" line? 
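The relaxed CO-line match discussed in this thread can be tried outside of ace.pm. What follows is only a minimal standalone sketch, not the code in Bio/Assembly/IO/ace.pm: the helper name parse_co_line and the hash keys are made up for illustration, \S+ is used for the contig name where the post above uses \w+, and the meaning given to the trailing fields (base count, read count, base segment count, U/C orientation) follows the usual description of the phrap/consed ACE format.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split an ACE "CO" line into fields.
# The contig name is the whole second word, so both "Contig7" (phrap) and
# "CL15Contig2" (tgicl/cap3) are accepted.
sub parse_co_line {
    my ($line) = @_;
    if ($line =~ /^CO\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+([UC])\b/) {
        return {
            id      => $1,   # full contig name, not just the digits
            n_bases => $2,
            n_reads => $3,
            n_segs  => $4,
            orient  => $5,   # U = uncomplemented, C = complemented
        };
    }
    return;   # not a CO line
}

for my $line ('CO CL15Contig2 794 4 0 U', 'CO Contig7 1203 12 5 C') {
    my $co = parse_co_line($line) or next;
    print "$co->{id}: $co->{n_bases} bases, $co->{n_reads} reads\n";
}
```

Keeping the whole name as the id means downstream code can no longer assume the id is numeric, which is the trade-off raised in the original question.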
> > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > -- Wes Barris E-Mail: Wes.Barris@csiro.au From redwards at utmem.edu Mon Feb 16 22:40:40 2004 From: redwards at utmem.edu (Rob Edwards) Date: Mon Feb 16 22:46:24 2004 Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM In-Reply-To: References: Message-ID: <0E84084E-60FB-11D8-B87B-000A959E1622@utmem.edu> I wrote this implementation of the Bio::SeqFeature::Primer module, and I am guilty as charged. I was confused by the primer3 documentation, but hey, at least I was honest in the docs - I said that I couldn't figure out what was going on. I don't know what the formula is that Primer3 used. If you have some code for this I can update the module. Rob On Feb 12, 2004, at 5:50 AM, john herbert wrote: > Hello BioPerl. > I was thinking of using BioPerl to calculate the Tm of my primers. I > looked from the documentation and found the method for getting a > primer's Tm. my $tm = $primer->Tm; > > In the notes of this method, the author is confused by the fact that a > BioPerl calculated Tm never matches that of Primer3's prediction. This > is because Primer3 uses a different method to calculate a Primers TM > than the method used in the BioPerl. > > BioPerl $primer->Tm = Calculated using: Tm = 81.5 + 16.6(log10([Na+])) > + .41*(%GC) - 600/length > (Sambrook, Fritsch and Maniatis, Molecular Cloning, p 11.46 (1989, CSHL > Press). > > Primer3 primer TM = Primer3 uses the oligo melting temperature formula > given in Rychlik, Spencer and Rhoads, Nucleic Acids Research, vol 18, > num 12, pp 6409-6412 and Breslauer, Frank, Bloeker and Marky, Proc. > Natl. Acad. Sci. USA, vol 83, pp 3746-3750. This method uses > thermodynamics to calc TM. > > Primer3 does use the Sambrook method to predict the TM of the PCR > products but not the Primer TM. > > Would it be possible to update the BioPerl to match how Primer3 > calculates the Primer TM? 
> > Kind regards, > > John Herbert > > Cancer Research UK. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From hlapp at gmx.net Tue Feb 17 00:03:45 2004 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue Feb 17 00:09:33 2004 Subject: [Bioperl-l] Bug in GCG SeqIO Formatting? In-Reply-To: Message-ID: Rule #1: If your code doesn't work the way you think it should, or fails with an exception, and you do want help from the mailing list, then be sure to send along the *complete* output, in particular the stack trace if there was any. Rule #2: Double check that you followed rule #1. Rule #3: Check again that you followed rule #1. There really aren't any other rules here. If you choose not to follow rule #1 you indicate that you're not actually interested in getting help. -hilmar On Monday, February 16, 2004, at 02:49 PM, Tex Thompson wrote: > Hello Mailing List, > > I have a user complaining that the following code isn't working on his > GCG-formatted sequence files: > > #!/usr/bin/perl > > use strict; > > use Bio::SeqIO; > my $io = Bio::SeqIO->new( -file => "af317472.gbpln3", -format => > "gcg"); > my $out = Bio::SeqIO->new( -fh => \*STDOUT, -format => "fasta" ); > > while ( my $seq = $io->next_seq ) { > $out->write_seq( $seq ); > } > > Here's an example sequence file: > > !!NA_SEQUENCE 1.0 > LOCUS AF317472 2679 bp DNA linear PLN > 07-DEC-2000 > DEFINITION Candida albicans cAMP-dependent protein kinase regulatory > subunit > (PKA-R) gene, complete cds. > ACCESSION AF317472 > VERSION AF317472.1 GI:11596392 > KEYWORDS . > SOURCE Candida albicans > ORGANISM Candida albicans > Eukaryota; Fungi; Ascomycota; Saccharomycotina; > Saccharomycetes; > Saccharomycetales; mitosporic Saccharomycetales; Candida. > REFERENCE 1 (bases 1 to 2679) > AUTHORS Giasson,L. and Parrot,M. 
> TITLE Sequence of the Candida albicans cAMP-dependent protein > kinase > regulatory subunit > JOURNAL Unpublished > REFERENCE 2 (bases 1 to 2679) > AUTHORS Giasson,L. and Parrot,M. > TITLE Direct Submission > JOURNAL Submitted (27-OCT-2000) School of Dentistry, Laval > University, > GREB, Ste-Foy, Quebec G1K 7P4, Canada > FEATURES Location/Qualifiers > source 1. .2679 > /organism="Candida albicans" > /mol_type="genomic DNA" > /strain="CAI4" > /db_xref="taxon:5476" > gene <977. .>2356 > /gene="PKA-R" > mRNA <977. .>2356 > /gene="PKA-R" > /product="cAMP-dependent protein kinase regulatory > subunit" > CDS 977. .2356 > /gene="PKA-R" > /codon_start=1 > /transl_table=12 > /product="cAMP-dependent protein kinase regulatory > subunit" > /protein_id="AAG38599.1" > /db_xref="GI:11596393" > > /translation="MSNPQQQFISDELSQLQKEIISKNPQDVLQFCANYFNTKLQAQR > > SELWSQQAKAEAAGIDLFPSVDHVNVNSSGVSIVNDRQPSFKSPFGVNDPHSNHDEDP > > HAKDTKTDTAAAAVGGGIFKSNFDVKKSASNPPTKEVDPDDPSKPSSSSQPNQQSASA > > SSKTPSSKIPVAFNANRRTSVSAEALNPAKLKLDSWKPPVNNLSITEEETLANNLKNN > > FLFKQLDANSKKTVIAALQQKSFAKDTVIIQQGDEGDFFYIIETGTVDFYVNDAKVSS > > SSEGSSFGELALMYNSPRAATAVAATDVVCWALDRLTFRRILLEGTFNKRLMYEDFLK > > DIEVLKSLSDHARSKLADALSTEMYHKGDKIVTEGEQGENFYLIESGNCQVYNEKLGN > > IKQLTKGDYFGELALIKDLPRQATVEALDNVIVATLGKSGFQRLLGPVVEVLKEQDPT > KSQDPTAGH" > ORIGIN > > AF317472 Length: 2679 February 16, 2004 17:02 Type: N Check: 9369 > .. 
> > 1 GAATTCAAAA AATCAAAAAA ATCAAAAAAA AACCGTGGAA GGTAAGTTGT > > 51 ATATTTATAA ATCAACGTGA ATAATTTTCA ACACTGTGTC AACATCTGTG > > 101 AAAAAAACCT GTGTGTACTG CATATAGGAC CTCACCTATT ACGTAGAATA > > 151 TACTAGAAAT AGTTACAACC ATAAAAAGAT TAATTGTGCT TACGTGGCAA > > 201 CTTTGAGATT TTTCTTTTTT CTGTTTCTTT CTTTCTTTTT TTGGCTTAAA > > 251 CAACAAATGT CGCAAATTAT ACAAACGACA TTTGCTGCCC ATGTCATTTT > > 301 GTCGTTATCA CGTGAAGTGT CGCAGATTTA TGTATTCTCA CTTCATTTCT > > 351 ATGGTCATCA ATTGTTCATT CATTCTCTAT CTTCAAAAAT CTGTGATTTG > > 401 ATGATTTTGA TTAAAAGAAA GCAAAGAGAA TACTGAAAAA AAGCAAAGAG > > 451 AATATAGAAA AGAAACAATA AAAGAATAGT TTCTAAGTTA CTTTGGAGTC > > 501 TGCTATTACC ATGTATCTAT GTGATTGCCC TATCAAATTG GACAATACGG > > 551 GTTTTTGTTT AGTCACGATA ATCACAAACT TCCCCCAGCA ATGACATACG > > 601 TAGCAAGTAA TATTTATATC TCTTCTATTT TTTTGATCTT ACATAATCTG > > 651 TCGTGTTTTT TTAAGTTGTT GTTATGAAGA AGTAATTTCA TAATGATCAA > > 701 GTGTGTAACT GAAATTTCAT CGCAATTTTA AACAAACAAG CTAATAATTA > > 751 TTATTATTAA TAGTTAATTT GCTAAGTTGA GTAAAATTTG CTTTTCTTGA > > 801 GAAAAAGGAG AAATTACTTT GGGAGTGAGT TTGAAGAGAG AAACTAAAGT > > 851 AAGTAAATGA GTGAGAGGGA GAGACAGAGA GCGAGAGGGG GAGTAAAAAA > > 901 AAAAGTTGCC CACAAACAAA TTGTGATACC GGTCTTTTAG CATATATCTT > > 951 CTACTCTTCA ATCAACATCT TTACCAATGT CTAATCCTCA ACAACAATTC > > 1001 ATATCTGATG AATTGTCGCA GTTACAGAAA GAAATAATTT CCAAAAACCC > > 1051 GCAAGATGTC TTACAGTTTT GCGCCAACTA TTTCAACACC AAGTTACAAG > > 1101 CTCAAAGAAG TGAGTTATGG TCGCAACAAG CTAAAGCAGA AGCCGCAGGC > > 1151 ATCGACTTAT TCCCATCTGT TGATCATGTG AATGTTAATT CTAGTGGTGT > > 1201 GAGCATTGTG AATGATAGAC AACCAAGTTT TAAATCACCT TTTGGTGTTA > > 1251 ATGATCCACA TCTGAATCAC GACGAAGATC CCCATGCCAA AGATACCAAA > > 1301 ACAGATACTG CTGCTGCTGC TGTTGGTGGG GGTATTTTCA AATCAAATTT > > 1351 TGATGTTAAA AAGAGTGCTT CTAATCCTCC AACCAAGGAA GTAGATCCAG > > 1401 ATGACCCATC AAAACCATCG TCATCGAGCC AACCAAATCA ACAATCAGCA > > 1451 TCAGCATCAT CAAAAACGCC ATCATCAAAG ATCCCAGTTG CTTTCAACGC > > 1501 TAATAGAAGA ACATCTGTAT CTGCTGAAGC CTTGAATCCA GCAAAATTGA > > 1551 AATTAGATAG TTGGAAACCT 
CCAGTTAATA ATTTGAGCAT TACCGAAGAA > > 1601 GAAACATTAG CCAACAATTT AAAGAACAAT TTCCTTTTCA AACAATTGGA > > 1651 CGCAAACTCT AAGAAAACTG TGATTGCTGC TTTACAACAA AAATCATTTG > > 1701 CTAAAGATAC AGTAATTATC CAACAAGGTG ATGAAGGGGA CTTTTTTTAC > > 1751 ATTATTGAAA CTGGTACAGT TGATTTCTAT GTTAATGATG CTAAAGTAAG > > 1801 TTCCAGTAGC GAAGGGTCAT CTTTTGGGGA ATTGGCTTTG ATGTATAATT > > 1851 CACCAAGAGC TGCTACGGCA GTTGCTGCCA CCGATGTTGT CTGTTGGGCA > > 1901 TTGGACCGTT TGACATTCCG TCGAATTCTT TTGGAAGGTA CTTTTAACAA > > 1951 GAGATTGATG TACGAGGATT TCTTAAAAGA TATTGAGGTT TTGAAATCTC > > 2001 TTTCGGATCA TGCACGTTCA AAATTGGCAG ATGCATTGAG CACAGAAATG > > 2051 TATCACAAGG GTGATAAAAT AGTCACTGAA GGTGAACAAG GAGAGAACTT > > 2101 TTATTTAATA GAAAGTGGAA ACTGTCAAGT TTACAATGAA AAGTTGGGCA > > 2151 ATATCAAACA ATTAACAAAA GGTGATTATT TTGGTGAGCT TGCATTAATA > > 2201 AAAGACTTAC CAAGACAAGC TACTGTGGAA GCATTGGATA ATGTAATCGT > > 2251 TGCCACATTA GGTAAATCCG GGTTCCAAAG ATTATTGGGT CCTGTTGTGG > > 2301 AGGTATTGAA AGAACAAGAC CCTACAAAGA GTCAAGACCC AACTGCTGGT > > 2351 CATTAAGTGT ACAATAAGTA GTTGTTTATT ATCTTATATT GTTTTATGTT > > 2401 AGTATATTCT ATCTTTTTTT TTTTGGCTTA CTCACCTTCT GGTGTTTTCG > > 2451 TTGCGATTTT GATAATGGAT GGTTGGTGCA AAAGTTCAAC TACATTTCTT > > 2501 GTTGTCAGGT ATATACGAGA TGGCAGCATG AACGAGCTCA CCATGGGTTG > > 2551 AACATTATTG AAGTTATCCG GCCGTGCCTT TTGCGAAACA TGGTAACTAA > > 2601 TATATTGCAA ACTTGGCTTC TACAGAAAAT ATACAATCTA ATACCTTGAG > > 2651 GAATTTCCTC TATATATAAT AGAGAATTC > > I'm not a GCG expert, but is this a correctly formatted GCG file in > the first > place? If not, is this an error in the SeqIO parser? I've found this > behavior > to be the same on Solaris 8 and on Linux, both running BioPerl 1.4 and > Perl > 5.8.1. 
> > Thanks a bunch, > > Tex Thompson > RIT Bioinformatics > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From tex at biosysadmin.com Tue Feb 17 01:21:51 2004 From: tex at biosysadmin.com (Tex Thompson) Date: Tue Feb 17 01:17:14 2004 Subject: [Bioperl-l] Bug in GCG SeqIO Formatting? In-Reply-To: Message-ID: Hilmar, Thanks for the tip. There are no stack errors, but here is the output from the test program shown below: >AF317472 !!NA_SEQUENCE 1.0LOCUS AF317472 2679 bp DNA linear PLN 07-DEC-2000DEFINITION Candida albicans cAMP-dependent protein kinase regulatory subunit (PKA-R) gene, complete cds.ACCESSION AF317472VERSION AF317472.1 GI:11596392KEYWORDS .SOURCE Candida albicans ORGANISM Candida albicans Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes; Saccharomycetales; mitosporic Saccharomycetales; Candida.REFERENCE 1 (bases 1 to 2679) AUTHORS Giasson,L. and Parrot,M. TITLE Sequence of the Candida albicans cAMP-dependent protein kinase regulatory subunit JOURNAL UnpublishedREFERENCE 2 (bases 1 to 2679) AUTHORS Giasson,L. and Parrot,M. TITLE Direct Submission JOURNAL Submitted (27-OCT-2000) School of Dentistry, Laval University, GREB, Ste-Foy, Quebec G1K 7P4, CanadaFEATURES Location/Qualifiers source 1. .2679 /organism="Candida albicans" /mol_type="genomic DNA" /strain="CAI4" /db_xref="taxon:5476" gene <977. .>2356 /gene="PKA-R" mRNA <977. .>2356 /gene="PKA-R" /product="cAMP-dependent protein kinase regulatory subunit" CDS 977. 
.2356 /gene="PKA-R" /codon_start=1 /transl_table=12 /product="cAMP-dependent protein kinase regulatory subunit" /protein_id="AAG38599.1" /db_xref="GI:11596393" /translation="MSNPQQQFISDELSQLQKEIISKNPQDVLQFCANYFNTKLQAQR SELWSQQAKAEAAGIDLFPSVDHVNVNSSGVSIVNDRQPSFKSPFGVNDPHSNHDEDP HAKDTKTDTAAAAVGGGIFKSNFDVKKSASNPPTKEVDPDDPSKPSSSSQPNQQSASA SSKTPSSKIPVAFNANR RTSVSAEALNPAKLKLDSWKPPVNNLSITEEETLANNLKNN FLFKQLDANSKKTVIAALQQKSFAKDTVIIQQGDEGDFFYIIETGTVDFYVNDAKVSS SSEGSSFGELALMYNSPRAATAVAATDVVCWALDRLTFRRILLEGTFNKRLMYEDFLK DIEVLKSLSDHARSKLADALSTEMYHKGDKIVTEGEQGENFYLIESGNCQVYNEKLGN IKQLTKGDYFGELALIKDLPRQATVEALDNVIVATLGKSGFQRLLGPVVEVLKEQDPT KSQDPTAGH"ORIGIN GAATTCAAAAAATCAAAAAAATCAAAAAAAAACCGTGGAAGGTAAGTTGTATATTTATAA ATCAACGTGAATAATTTTCAACACTGTGTCAACATCTGTGAAAAAAACCTGTGTGTACTG CATATAGGACCTCACCTATTACGTAGAATATACTAGAAATAGTTACAACCATAAAAAGAT TAATTGTGCTTACGTGGCAACTTTGAGATTTTTCTTTTTTCTGTTTCTTTCTTTCTTTTT TTGGCTTAAACAACAAATGTCGCAAATTATACAAACGACATTTGCTGCCCATGTCATTTT GTCGTTATCACGTGAAGTGTCGCAGATTTATGTATTCTCACTTCATTTCTATGGTCATCA ATTGTTCATTCATTCTCTATCTTCAAAAATCTGTGATTTGATGATTTTGATTAAAAGAAA GCAAAGAGAATACTGAAAAAAAGCAAAGAGAATATAGAAAAGAAACAATAAAAGAATAGT TTCTAAGTTACTTTGGAGTCTGCTATTACCATGTATCTATGTGATTGCCCTATCAAATTG GACAATACGGGTTTTTGTTTAGTCACGATAATCACAAACTTCCCCCAGCAATGACATACG TAGCAAGTAATATTTATATCTCTTCTATTTTTTTGATCTTACATAATCTGTCGTGTTTTT TTAAGTTGTTGTTATGAAGAAGTAATTTCATAATGATCAAGTGTGTAACTGAAATTTCAT CGCAATTTTAAACAAACAAGCTAATAATTATTATTATTAATAGTTAATTTGCTAAGTTGA GTAAAATTTGCTTTTCTTGAGAAAAAGGAGAAATTACTTTGGGAGTGAGTTTGAAGAGAG AAACTAAAGTAAGTAAATGAGTGAGAGGGAGAGACAGAGAGCGAGAGGGGGAGTAAAAAA AAAAGTTGCCCACAAACAAATTGTGATACCGGTCTTTTAGCATATATCTTCTACTCTTCA ATCAACATCTTTACCAATGTCTAATCCTCAACAACAATTCATATCTGATGAATTGTCGCA GTTACAGAAAGAAATAATTTCCAAAAACCCGCAAGATGTCTTACAGTTTTGCGCCAACTA TTTCAACACCAAGTTACAAGCTCAAAGAAGTGAGTTATGGTCGCAACAAGCTAAAGCAGA AGCCGCAGGCATCGACTTATTCCCATCTGTTGATCATGTGAATGTTAATTCTAGTGGTGT GAGCATTGTGAATGATAGACAACCAAGTTTTAAATCACCTTTTGGTGTTAATGATCCACA TCTGAATCACGACGAAGATCCCCATGCCAAAGATACCAAAACAGATACTGCTGCTGCTGC 
TGTTGGTGGGGGTATTTTCAAATCAAATTTTGATGTTAAAAAGAGTGCTTCTAATCCTCC AACCAAGGAAGTAGATCCAGATGACCCATCAAAACCATCGTCATCGAGCCAACCAAATCA ACAATCAGCATCAGCATCATCAAAAACGCCATCATCAAAGATCCCAGTTGCTTTCAACGC TAATAGAAGAACATCTGTATCTGCTGAAGCCTTGAATCCAGCAAAATTGAAATTAGATAG TTGGAAACCTCCAGTTAATAATTTGAGCATTACCGAAGAAGAAACATTAGCCAACAATTT AAAGAACAATTTCCTTTTCAAACAATTGGACGCAAACTCTAAGAAAACTGTGATTGCTGC TTTACAACAAAAATCATTTGCTAAAGATACAGTAATTATCCAACAAGGTGATGAAGGGGA CTTTTTTTACATTATTGAAACTGGTACAGTTGATTTCTATGTTAATGATGCTAAAGTAAG TTCCAGTAGCGAAGGGTCATCTTTTGGGGAATTGGCTTTGATGTATAATTCACCAAGAGC TGCTACGGCAGTTGCTGCCACCGATGTTGTCTGTTGGGCATTGGACCGTTTGACATTCCG TCGAATTCTTTTGGAAGGTACTTTTAACAAGAGATTGATGTACGAGGATTTCTTAAAAGA TATTGAGGTTTTGAAATCTCTTTCGGATCATGCACGTTCAAAATTGGCAGATGCATTGAG CACAGAAATGTATCACAAGGGTGATAAAATAGTCACTGAAGGTGAACAAGGAGAGAACTT TTATTTAATAGAAAGTGGAAACTGTCAAGTTTACAATGAAAAGTTGGGCAATATCAAACA ATTAACAAAAGGTGATTATTTTGGTGAGCTTGCATTAATAAAAGACTTACCAAGACAAGC TACTGTGGAAGCATTGGATAATGTAATCGTTGCCACATTAGGTAAATCCGGGTTCCAAAG ATTATTGGGTCCTGTTGTGGAGGTATTGAAAGAACAAGACCCTACAAAGAGTCAAGACCC AACTGCTGGTCATTAAGTGTACAATAAGTAGTTGTTTATTATCTTATATTGTTTTATGTT AGTATATTCTATCTTTTTTTTTTTGGCTTACTCACCTTCTGGTGTTTTCGTTGCGATTTT GATAATGGATGGTTGGTGCAAAAGTTCAACTACATTTCTTGTTGTCAGGTATATACGAGA TGGCAGCATGAACGAGCTCACCATGGGTTGAACATTATTGAAGTTATCCGGCCGTGCCTT TTGCGAAACATGGTAACTAATATATTGCAAACTTGGCTTCTACAGAAAATATACAATCTA ATACCTTGAGGAATTTCCTCTATATATAATAGAGAATTC It looks like a lot of the header information is all stuck on that first line. Looking at it more carefully it looks like a valid FASTA file, but is this really desired behavior? Thanks for the help, Tex Thompson RIT Bioinformatics On Mon, 16 Feb 2004, Hilmar Lapp wrote: > Rule #1: If your code doesn't work the way you think it should, or > fails with an exception, and you do want help from the mailing list, > then be sure to send along the *complete* output, in particular the > stack trace if there was any. > > Rule #2: Double check that you followed rule #1. 
> > Rule #3: Check again that you followed rule #1. > > There really aren't any other rules here. If you choose not to follow > rule #1 you indicate that you're not actually interested in getting > help. > > -hilmar From Sebastien.Moretti at igs.cnrs-mrs.fr Tue Feb 17 02:44:45 2004 From: Sebastien.Moretti at igs.cnrs-mrs.fr (Sebastien Moretti) Date: Tue Feb 17 02:45:25 2004 Subject: [Bioperl-l] get sequence failure In-Reply-To: <4030ECCE.8090608@cenix-bioscience.com> References: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com> <4030ECCE.8090608@cenix-bioscience.com> Message-ID: <200402170844.45672.Sebastien.Moretti@igs.cnrs-mrs.fr> Hi, I use use Bio::DB::GenBank; use Bio::DB::Query::GenBank; use Bio::SeqIO; and I can get RefSeq, nr and GenPept files with accession numbers. > Hi, > > I think the problem could be that you are trying to retrieve a contig > (NT number). I remember Bio::DB::GenBank used to have problems with > those, but I'm not sure about now. > > And from reading the Bio::Perl::get_sequence POD, I see: > > get_sequence > > Title : get_sequence > Usage : $seq_object = get_sequence('swiss',"ROA1_HUMAN"); > ...
> > Args : database type - one of swiss, embl, genbank or refseq > identifier or accession number > > > Are you sure 'nr' is a valid database type? > > Cheers, > > Andrew > > william ritchie wrote: > > Hi, > > > > I m trying to retrieve a sequence with the following > > code: > > > > use Bio::SearchIO; > > use Bio::Perl; > > use Bio::SeqIO; > > > > my $seq_object = get_sequence("nr","NT_039208.2"); > > > > and I m getting this error: > > > > MSG: id does not exist > > STACK Bio::DB::WebDBSeqI::get_Seq_by_id > > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:155 > > STACK Bio::Perl::get_sequence > > /usr/lib/perl5/site_perl/5.8.0/Bio/Perl.pm:513 > > STACK toplevel getseq.pl:9 > > > > this ID does exist, using the ncbi interface I can > > retrieve it! > > Help!!!!Please!! -- Sebastien MORETTI CNRS - IGS 31 chemin Joseph Aiguier 13402 Marseille cedex 20, FRANCE tel. +334 91 16 44 55 - +336 61 88 59 00 From Marc.Logghe at devgen.com Tue Feb 17 03:03:00 2004 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue Feb 17 03:09:22 2004 Subject: [Bioperl-l] ace.pm Message-ID: > -----Original Message----- > From: Wes Barris [mailto:wes.barris@csiro.au] > Sent: dinsdag 17 februari 2004 0:19 > To: Bioperl Mailing List > Subject: [Bioperl-l] ace.pm > > > Hi, > > ACE files generated by an application called tgicl have "CO" > lines of the form: > > CO CL15Contig2 794 4 0 U > > This line is not parsed properly by the ace.pm bioperl module. > Notice this line from Bio/Assembly/IO/ace.pm . > > (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New > contig found! > A long time ago I've adapted the code in order to handle tgicl (cap3) generated ACE files. Don't know however, if it made it to CVS. 
I'll have a look, I'll send you the patch as soon as I traced it HTH, Marc From pvh at egenetics.com Tue Feb 17 03:45:16 2004 From: pvh at egenetics.com (Peter van Heusden) Date: Tue Feb 17 03:52:09 2004 Subject: [Bioperl-l] ace.pm In-Reply-To: References: Message-ID: <4031D49C.9060402@egenetics.com> Marc Logghe wrote: > > >>-----Original Message----- >>From: Wes Barris [mailto:wes.barris@csiro.au] >>Sent: dinsdag 17 februari 2004 0:19 >>To: Bioperl Mailing List >>Subject: [Bioperl-l] ace.pm >> >> >>Hi, >> >>ACE files generated by an application called tgicl have "CO" >>lines of the form: >> >>CO CL15Contig2 794 4 0 U >> >>This line is not parsed properly by the ace.pm bioperl module. >>Notice this line from Bio/Assembly/IO/ace.pm . >> >> (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New >>contig found! >> >> >> >A long time ago I've adapted the code in order to handle tgicl (cap3) generated ACE files. Don't know however, if it made it to CVS. I'll have a look, I'll send you the patch as soon as I traced it >HTH, >Marc > > Modern versions of phrap can generate two types of ACE files: I think the one with the CO instead of Contig is generated when you pass phrap the -old_ace command line parameter. The CO/Contig difference is only one of many, but I assume your code deals with the other changes as well. I know we had to handle these different formats in stackPACK. Peter From john.herbert at clinical-pharmacology.oxford.ac.uk Tue Feb 17 04:07:44 2004 From: john.herbert at clinical-pharmacology.oxford.ac.uk (john herbert) Date: Tue Feb 17 04:13:40 2004 Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM Message-ID: Hello Rob, Thanks for the reply. I agree about Primer3 documentation, had me puzzled for a while before I sussed it. At the moment I don't have the code but I guess the easiest way could be to look at the source code of Primer3 and see what the C code looks like (I am assuming it is C) for calculating the Primer TM. 
Then try to mimic it in Perl. If I get time soon, I will have a look. Alternatively we could look at the references. If there are TM Gurus out there who have a better suggestion, please mail us. Kind regards, John. >>> Rob Edwards 17/02/2004 03:40:40 >>> I wrote this implementation of the Bio::SeqFeature::Primer module, and I am guilty as charged. I was confused by the primer3 documentation, but hey, at least I was honest in the docs - I said that I couldn't figure out what was going on. I don't know what the formula is that Primer3 used. If you have some code for this I can update the module. Rob On Feb 12, 2004, at 5:50 AM, john herbert wrote: > Hello BioPerl. > I was thinking of using BioPerl to calculate the Tm of my primers. I > looked from the documentation and found the method for getting a > primer's Tm. my $tm = $primer->Tm; > > In the notes of this method, the author is confused by the fact that a > BioPerl calculated Tm never matches that of Primer3's prediction. This > is because Primer3 uses a different method to calculate a Primers TM > than the method used in the BioPerl. > > BioPerl $primer->Tm = Calculated using: Tm = 81.5 + 16.6(log10([Na+])) > + .41*(%GC) - 600/length > (Sambrook, Fritsch and Maniatis, Molecular Cloning, p 11.46 (1989, CSHL > Press). > > Primer3 primer TM = Primer3 uses the oligo melting temperature formula > given in Rychlik, Spencer and Rhoads, Nucleic Acids Research, vol 18, > num 12, pp 6409-6412 and Breslauer, Frank, Bloeker and Marky, Proc. > Natl. Acad. Sci. USA, vol 83, pp 3746-3750. This method uses > thermodynamics to calc TM. > > Primer3 does use the Sambrook method to predict the TM of the PCR > products but not the Primer TM. > > Would it be possible to update the BioPerl to match how Primer3 > calculates the Primer TM? > > Kind regards, > > John Herbert > > Cancer Research UK. 
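For reference, the salt-adjusted formula quoted above (Tm = 81.5 + 16.6(log10([Na+])) + .41*(%GC) - 600/length) is easy to evaluate directly. The sketch below is not the code in Bio::SeqFeature::Primer, and it does not attempt Primer3's thermodynamic calculation; the function name tm_sambrook and the default of 0.05 M Na+ are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/length
# $na is the molar Na+ concentration; 0.05 M is an assumed default.
sub tm_sambrook {
    my ($seq, $na) = @_;
    $na = 0.05 unless defined $na;
    my $len = length $seq;
    die "empty sequence\n" unless $len;
    my $gc     = ($seq =~ tr/GCgc//);   # count G and C (sequence is not modified)
    my $pct_gc = 100 * $gc / $len;
    my $log_na = log($na) / log(10);    # Perl's log() is the natural log
    return 81.5 + 16.6 * $log_na + 0.41 * $pct_gc - 600 / $len;
}

# 20-mer, 50% GC, 0.05 M Na+
printf "%.1f\n", tm_sambrook('ATGCATGCATGCATGCATGC');   # prints 50.4
```

Because this formula only looks at length and GC content, two primers with the same composition but different base order get the same value, which is one reason its numbers differ from the Primer3 output.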
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From Marc.Logghe at devgen.com Tue Feb 17 04:36:58 2004 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue Feb 17 13:48:27 2004 Subject: [Bioperl-l] ace.pm Message-ID: The thread (including patch) can be found here: http://bioperl.org/pipermail/bioperl-l/2002-December/010677.html The patch should still work, cvs version of the package did not change in bioperl. HTH, Marc > -----Original Message----- > From: Wes Barris [mailto:wes.barris@csiro.au] > Sent: dinsdag 17 februari 2004 4:36 > To: Jason Stajich > Cc: Bioperl Mailing List > Subject: Re: [Bioperl-l] ace.pm > > > Jason Stajich wrote: > > > > > People write code and modules to support the work they are doing, > > sometimes for a specific data set - so I suspect Robson > wrote this to > > support phrap ace format which has a convention of them > being ContigXX. > > > > You are welcome to make changes to code on your local > system to get it > > working and then post the diffs so they can be incorporated > back in. Why > > not try changing the code as you have noticed and seeing if > it works. It > > is a collaborative project and these modules are newish, so > give a try > > fixing things and then getting feedback on your fixes. > > I have modified one line in Bio/Assembly/IO/ace.pm as shown below: > > # Loading contig sequence (COntig sequence field) > # (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # > New contig > found! > (/^CO (\w+) (\d+) (\d+) (\d+) (\w+)/) && do { # New > contig found! > > The change will cause the contigID to be whatever the second field of > this line is (CO CL15Contig1 794 4 0 U). In this case, it would be > set to "CL15Contig1". 
> > > > > > -jason > > > > On Tue, 17 Feb 2004, Wes Barris wrote: > > > > > Hi, > > > > > > ACE files generated by an application called tgicl have "CO" > > > lines of the form: > > > > > > CO CL15Contig2 794 4 0 U > > > > > > This line is not parsed properly by the ace.pm bioperl module. > > > Notice this line from Bio/Assembly/IO/ace.pm . > > > > > > (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && > do { # New > > > contig found! > > > > > > Bioperl expects the second "word" in the line to be > "Contig\d+" where > > > the number is used as the "contigID". Is there a reason why > > > "contigID" must be a number? Why can't it be the whole second > > > "word" of the "CO" line? > > > > > > > -- > > Jason Stajich > > Duke University > > jason at cgt.mc.duke.edu > > > > > -- > Wes Barris > E-Mail: Wes.Barris@csiro.au > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > From Jan.Aerts at wur.nl Tue Feb 17 09:42:03 2004 From: Jan.Aerts at wur.nl (Aerts, Jan) Date: Tue Feb 17 13:52:10 2004 Subject: [Bioperl-l] get_Seq_by_id: CONTIG found Message-ID: <7D030487F1A3D143A76F2A1E91F57035EF3D2E@scomp0010> Hi all, I'm trying to download a bunch of sequences from GenBank using the ID and get_Seq_by_id (see script below). This method works great, except when it hits a sequence that in fact is a scaffold (e.g. http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=38322681). The message I get is: -------------------- WARNING --------------------- MSG: CONTIG found. GenBank get_Stream_by_acc about to run. --------------------------------------------------- Warning: unable to close filehandle FETCH properly. Is there a way to test if the ID refers to a contig instead of a 'regular' sequence?
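One possible way to tell the two kinds of record apart is to look at the flat file itself: scaffold-style GenBank entries carry a CONTIG line with a join(...) of component accessions instead of the sequence. The sketch below is plain Perl and does not use Bio::DB::GenBank; the function name is_contig_record and both sample records are invented for illustration, and how the raw record text is obtained is left open.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a GenBank flat-file record has a CONTIG line.
sub is_contig_record {
    my ($text) = @_;
    return $text =~ /^CONTIG\s+/m ? 1 : 0;
}

# Made-up records, trimmed to the lines that matter here.
my $scaffold = <<'END';
LOCUS       EXAMPLE1     1000 bp    DNA     linear   CON 01-JAN-2004
CONTIG      join(EXAMPLE2.1:1..600,gap(100),EXAMPLE3.1:1..300)
//
END

my $regular = <<'END';
LOCUS       EXAMPLE4       12 bp    DNA     linear   PLN 01-JAN-2004
ORIGIN
        1 gaattcaaaa aa
//
END

print is_contig_record($scaffold) ? "contig\n" : "regular\n";
print is_contig_record($regular)  ? "contig\n" : "regular\n";
```

The check is anchored at the start of a line so that the word CONTIG inside a DEFINITION or COMMENT field does not trigger it.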
Thanks a lot, Jan Aerts use Bio::DB::GenBank; my @ids = (38524490,31745019,38322681); my $db = Bio::DB::GenBank->new(); foreach ( @ids ) { my $seq = $db->get_Seq_by_id($_); print ">", $seq->accession, '|', $seq->description, '|', $seq->keywords, "\n"; } From chapmanb at uga.edu Tue Feb 17 13:44:53 2004 From: chapmanb at uga.edu (Brad Chapman) Date: Tue Feb 17 13:57:50 2004 Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM In-Reply-To: References: Message-ID: <20040217184453.GH42800@evostick.agtec.uga.edu> Hey all; John: > > At the moment I don't have the code but I guess the easiest way could > > be to look at the source code of Primer3 and see what the C code looks > > like (I am assuming it is C) for calculating the Primer TM. Jason: > I believe biopython guys have dealt with this recently -- brad - can you > point us to working code/documentation for Tm calculation? Yes, Sebastian Bassi was working on this code for a while. He did come up with a version, but it turned out to not be the most current parameters, I believe. I don't think he finished the new version yet. But, there has been plenty of talk about it -- here's the relevant mails: Sebastian's original message: http://www.biopython.org/pipermail/biopython/2003-November/001701.html Implementations (for DNA and RNA): http://www.biopython.org/pipermail/biopython/2003-November/001742.html http://www.biopython.org/pipermail/biopython/2003-November/001745.html Mail from Peter Slickers with the information about the most current accepted parameter sets: http://www.biopython.org/pipermail/biopython/2003-November/001747.html Hope this helps you guys -- and reminds Sebastian we'd still eventually like to see this in Biopython :-). 
Brad From jason at cgt.duhs.duke.edu Tue Feb 17 07:46:25 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Feb 17 14:09:56 2004 Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM In-Reply-To: References: Message-ID: I believe biopython guys have dealt with this recently -- brad - can you point us to working code/documentation for Tm calculation? -jason On Tue, 17 Feb 2004, john herbert wrote: > Hello Rob, > Thanks for the reply. I agree about Primer3 documentation, had me > puzzled for a while before I sussed it. > > At the moment I don't have the code but I guess the easiest way could > be to look at the source code of Primer3 and see what the C code looks > like (I am assuming it is C) for calculating the Primer TM. Then try to > mimic it in Perl. If I get time soon, I will have a look. Alternatively > we could look at the references. > > If there are TM Gurus out there who have a better suggestion, please > mail us. > > Kind regards, > > John. > > >>> Rob Edwards 17/02/2004 03:40:40 >>> > I wrote this implementation of the Bio::SeqFeature::Primer module, and > > I am guilty as charged. I was confused by the primer3 documentation, > but hey, at least I was honest in the docs - I said that I couldn't > figure out what was going on. > > I don't know what the formula is that Primer3 used. If you have some > code for this I can update the module. > > Rob > > > > > On Feb 12, 2004, at 5:50 AM, john herbert wrote: > > > Hello BioPerl. > > I was thinking of using BioPerl to calculate the Tm of my primers. I > > looked from the documentation and found the method for getting a > > primer's Tm. my $tm = $primer->Tm; > > > > In the notes of this method, the author is confused by the fact that > a > > BioPerl calculated Tm never matches that of Primer3's prediction. > This > > is because Primer3 uses a different method to calculate a Primers TM > > than the method used in the BioPerl. 
> > > > BioPerl $primer->Tm = Calculated using: Tm = 81.5 + > 16.6(log10([Na+])) > > + .41*(%GC) - 600/length > > (Sambrook, Fritsch and Maniatis, Molecular Cloning, p 11.46 (1989, > CSHL > > Press). > > > > Primer3 primer TM = Primer3 uses the oligo melting temperature > formula > > given in Rychlik, Spencer and Rhoads, Nucleic Acids Research, vol > 18, > > num 12, pp 6409-6412 and Breslauer, Frank, Bloeker and Marky, Proc. > > Natl. Acad. Sci. USA, vol 83, pp 3746-3750. This method uses > > thermodynamics to calc TM. > > > > Primer3 does use the Sambrook method to predict the TM of the PCR > > products but not the Primer TM. > > > > Would it be possible to update the BioPerl to match how Primer3 > > calculates the Primer TM? > > > > Kind regards, > > > > John Herbert > > > > Cancer Research UK. > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From Annie.Law at nrc-cnrc.gc.ca Tue Feb 17 14:42:41 2004 From: Annie.Law at nrc-cnrc.gc.ca (Law, Annie) Date: Tue Feb 17 14:49:03 2004 Subject: [Bioperl-l] New GO Parser and errors loading biosql database Message-ID: <10C94843061E094A98C02EB77CFC328722FE05@nrcmrdex1d.imsb.nrc.ca> Hi, I would appreciate help with the following. I have installed the newest bioperl-db and biosql schema from cvs. I tried to load the database with information from godatabase.org and got some errors listed further below (the Tables did not fill at all). Next I tried to load the database with Locuslink data from NCBI. 
1)I got the LL file from NCBI and tried to load an empty datbase with a LL_tmpl file (for human) and It seemed to load properly and the tables were filling up but then it stopped after about 900 bioentries. I'm not sure what went wrong. There seem to be a complaint about duplicate entry but I don't think I should Modify the source file. [root@ data]# perl /root/bioperl-db/scripts/biosql/load_seqdatabase.pl --dbuser=root --dbpass=mss22 --dbname bioseqdb --namespace "LocusLink" -format locuslink /var/lib/mysql/LL_ _tmpl --dbpass=bioinf1 --dbname bioseqdb --namespace "LocusLink" -format locuslink /var/lib/mysql/LL_tmpl Loading /var/lib/mysql/LL_tmpl ... -------------------- WARNING --------------------- MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were ("GO:0005699","kinetochore","","") FKs (6) Duplicate entry 'kinetochore-6' for key 2 --------------------------------------------------- Could not store 1063: ------------- EXCEPTION ------------- MSG: create: object (Bio::Annotation::OntologyTerm) failed to insert or to be found by unique key STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253 STACK Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270 STACK Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm: 219 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:215 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253 STACK Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270 STACK 
Bio::DB::BioSQL::SeqAdaptor::store_children /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/SeqAdaptor.pm:226 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:215 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253 STACK Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270 STACK (eval) /root/bioperl-db/scripts/biosql/load_seqdatabase.pl:517 STACK toplevel /root/bioperl-db/scripts/biosql/load_seqdatabase.pl:500 -------------------------------------- 2) Updating GO parser I saw that the GO parser was updated recently and I have located the code version 1.17.2.1 I downloaded the new version. I am using bioperl 1.4. Should I just take the new dagflat.pm and replace the old one or are there more steps involved? When I download whole Modules I need to use make commands. Also I saw that dagflat.pm requires graph.pm. Is this graph.pm part of the bioperl 1.4 package I couldn't seem to find it or do I need to download and install from CPAN. I searched CPAN for graph.pm and got several hits. Is this the one I need? http://search.cpan.org/~mverb/GDGraph-1.43/ Do I also need GD.pm? I think I saw somewhere that it is required? http://search.cpan.org/~lds/GD-2.11/GD.pm Although this could be a mistake Where is the best place to install GD and graph.pm (with dagflat.pm or the main perl library)? I'm not sure whether the main perl library is /usr/lib/perl5/5.8.0 or /usr/lib/perl5/site_perl/5.8.0/Bio 3) I installed Bioperl-db and downloaded the biosql schema successfully but when I tried to use the Load_ontology.pl I got some errors which seem to be saying that I am missing some main modules such as goflat (I recorded a script of the output). But I have goflat.pm. Am I calling the perl script incorrectly? Or are there still some modules I need to install? 
I'm not sure that I am using the correct Syntax for the format field. Thanks very much, Annie. [root@ data]# perl /root/bioperl-db/scripts/biosql/load_ontology.pl --dbuser=root --d dbpass=mss22 --dbname bioseqdb --noobsolete --namespace "Gene Ontology" --fmtargs "-defs_file,/root/Go.defs" --format goflat ./component.ontology ./process.ontology ./function.ontology Bio::OntologyIO: goflat cannot be found Exception ------------- EXCEPTION ------------- MSG: Failed to load module Bio::OntologyIO::goflat. Can't locate Graph/Directed.pm in @INC (@INC contains: /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 .) at /usr/lib/perl5/site_perl/5.8.0/Bio/Ontology/SimpleGOEngine.pm line 95. BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.0/Bio/Ontology/SimpleGOEngine.pm line 95. Compilation failed in require at /usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO/dagflat.pm line 105. BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO/dagflat.pm line 105. Compilation failed in require at /usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO/goflat.pm line 105. BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO/goflat.pm line 105. Compilation failed in require at /usr/lib/perl5/site_perl/5.8.0/Bio/Root/Root.pm line 394. 
STACK Bio::Root::Root::_load_module /usr/lib/perl5/site_perl/5.8.0/Bio/Root/Root.pm:396 STACK (eval) /usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO.pm:255 STACK Bio::OntologyIO::_load_format_module /usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO.pm:254 STACK Bio::OntologyIO::new /usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO.pm:165 STACK toplevel /root/bioperl-db/scripts/biosql/load_ontology.pl:449 -------------------------------------- For more information about the OntologyIO system please see the docs. This includes ways of checking for formats at compile time, not run time Parsing input ... Can't call method "next_ontology" on an undefined value at /root/bioperl-db/scripts/biosql/load_ontology.pl line 455. From jason at cgt.duhs.duke.edu Tue Feb 17 16:03:27 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Feb 17 16:09:45 2004 Subject: [BioPython] Re: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM (fwd) Message-ID: Sebastian's comments cross-posted. -- Jason Stajich Duke University jason at cgt.mc.duke.edu ---------- Forwarded message ---------- Date: Tue, 17 Feb 2004 18:10:43 -0300 From: Sebastian Bassi To: Brad Chapman Cc: biopython@biopython.org Subject: Re: [BioPython] Re: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM Brad Chapman Escribio > Hope this helps you guys -- and reminds Sebastian we'd still > eventually like to see this in Biopython :-). I now. I am a little busy right now (have you hear of DNALinux?, I'm working on it and some academic projects also!). But I want to code the function. What is holding me now (appart from lack of time) is that I didn't find a working source code of Santalucia's formulae. I tried to follow the paper but I couldn't. An implamentation, even in C or pseudocode, would help me. Another thing that will be usful, would be some pre-made Tm calculation using Santalucia parameters. This will be useful to test my code. 
What I did was very close since it worked as EMBOSS DAN, but SantaLucia
is the way to go.

Sebastian Bassi. PGP Key available.
_________________________________________________________
_______________________________________________
BioPython mailing list - BioPython@biopython.org
http://biopython.org/mailman/listinfo/biopython

From pm66 at nyu.edu Tue Feb 17 15:18:23 2004
From: pm66 at nyu.edu (Philip MacMenamin)
Date: Tue Feb 17 16:58:21 2004
Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new wormbase models.
Message-ID: <200402172152.i1HLq31G015646@mx1.nyu.edu>

I cannot get things to aggregate properly in the new wormbase models.
Previously this worked:

$aggregator = Bio::DB::GFF::Aggregator->new(-method    => 'test',
                                            -sub_parts => ['UTR','CDS:curated']
                                           );

(Or I could have more simply used the 'processed_transcript' aggregator
that Lincoln wrote.)

And one feature, with the UTRs "aggregated" together with the curated
CDS's, was returned:

test:5_UTR(AH6.5), start->9524078 end->9526248

This could be fed to some kind of glyph, and would draw a nice picture of
UTRs hanging off a coding seq.

[--------------] [----------] [---->
(You have to use your imagination a bit here)

With the new models of wormbase, this is not the case, so I am re-writing
code to accommodate these changes.

Now I am returned, for example, 3 features: a 3 prime UTR, a 5 prime UTR,
and a CDS.

test:UTR(5_UTR:AH6.5)9524078 9524086
test:UTR(3_UTR:AH6.5)9525782 9526248
test:curated(AH6.5)9524087 9525781

The glyph then draws these things on two planes.

[] [------->
[---------] [------------] [----------]

My assumption is that a single feature can be rendered as a single glyph.
And if you have >1 feature, then >1 glyphs are needed. Therefore, I assume
the problem stems from my failure to aggregate UTR and CDS features.

I have noticed that the old SQL GFF Db used to have two fmethods for UTRs,
one each for the 3 and 5 primes. The new one has only one. There are other
changes of course, but I think this is important.
So, essentially I want to draw UTRs with CDS's attached... with the new
wormbase models. If anyone knows how to do this, I'd like to hear from them.

Thanks :)
--
Philip MacMenamin

From todd.harris at cshl.edu Tue Feb 17 17:10:48 2004
From: todd.harris at cshl.edu (Todd Harris)
Date: Tue Feb 17 17:17:14 2004
Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new wormbase models.
In-Reply-To: <200402172152.i1HLq31G015646@mx1.nyu.edu>
Message-ID:

Hi Philip -

You need to aggregate the separate parts of the CDS. Create a
wormbase_cds aggregator (or whatever you wish to call it), aggregating the
following features using the CDS group: coding_exon,5_UTR,3_UTR.

The following stanza should do the trick.

$dbgff = Bio::DB::GFF->new(-adaptor     => 'dbi::mysql',
                           -dsn         => 'dbi:mysql:database=your_database;host=your_host',
                           -aggregators => [qw(wormbase_cds{coding_exon,5_UTR,3_UTR/CDS})],
                           -user        => 'your_username',
                           -pass        => 'your_dbgff_pass');

This should properly aggregate genes under the new WormBase CDS class.

Todd Harris

> On 2/17/04 2:18 PM, Philip MacMenamin wrote:
> I cannot get things to aggregate properly in the new wormbase models.
> Previously this worked:
> $aggregator = Bio::DB::GFF::Aggregator->new(-method => 'test',
> -sub_parts => ['UTR','CDS:curated']
> );
> (Or I could have more simply used the 'processed_transcript' aggregator that
> Lincoln wrote.)
>
> And one feature, with the UTRs "aggregated" together with the curated CDS's,
> was returned:
> test:5_UTR(AH6.5), start->9524078 end->9526248
> This could be fed to some kind of glyph, and would draw a nice picture of
> UTRs hanging off a coding seq.
>
> [--------------] [----------] [---->
> (You have to use your imagination a bit here)
>
> With the new models of wormbase, this is not the case, so I am re-writing
> code to accommodate these changes.
>
> Now I am returned, for example, 3 features: a 3 prime UTR, a 5 prime UTR, and a CDS.
> test:UTR(5_UTR:AH6.5)9524078 9524086 > test:UTR(3_UTR:AH6.5)9525782 9526248 > test:curated(AH6.5)9524087 9525781 > > The glyph then draws these things on two planes. > [] [-------> > [---------] [------------] [----------] > > My assumption is that a single feature can be rendered as a single glyph. And > if you have >1 feature, then >1 glyphs are needed. Therefore, I assume the > problem stems from my failure to aggregate UTR and CDS features. > > I have noticed that the new SQL GFF Db used to have two fmethods for UTRs, > one for both 3 and 5 primes. The new one has only one. There are other > changes of course, but I think this is important. > > So, essentially I want to draw UTRs with CDS's attached... with the new > wormbase models. If anyone knows how to do this, i'd like to hear from them. > > Thanks :) From wes.barris at csiro.au Tue Feb 17 17:56:52 2004 From: wes.barris at csiro.au (Wes Barris) Date: Tue Feb 17 18:03:21 2004 Subject: [Bioperl-l] ace.pm In-Reply-To: References: Message-ID: <40329C34.9030203@csiro.au> Marc Logghe wrote: > The thread (including patch) can be found here: > http://bioperl.org/pipermail/bioperl-l/2002-December/010677.html > The patch should still work, cvs version of the package did not change > in bioperl. Will these changes make their way into the bioperl distribution? > HTH, > Marc > > > > -----Original Message----- > > From: Wes Barris [mailto:wes.barris@csiro.au] > > Sent: dinsdag 17 februari 2004 4:36 > > To: Jason Stajich > > Cc: Bioperl Mailing List > > Subject: Re: [Bioperl-l] ace.pm > > > > > > Jason Stajich wrote: > > > > > > > > People write code and modules to support the work they are doing, > > > sometimes for a specific data set - so I suspect Robson > > wrote this to > > > support phrap ace format which has a convention of them > > being ContigXX. > > > > > > You are welcome to make changes to code on your local > > system to get it > > > working and then post the diffs so they can be incorporated > > back in. 
Why > > > not try changing the code as you have noticed and seeing if > > it works. It > > > is a collaborative project and these modules are newish, so > > give a try > > > fixing things and then getting feedback on your fixes. > > > > I have modified one line in Bio/Assembly/IO/ace.pm as shown below: > > > > # Loading contig sequence (COntig sequence field) > > # (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # > > New contig > > found! > > (/^CO (\w+) (\d+) (\d+) (\d+) (\w+)/) && do { # New > > contig found! > > > > The change will cause the contigID to be whatever the second field of > > this line is (CO CL15Contig1 794 4 0 U). In this case, it would be > > set to "CL15Contig1". > > > > > > > > > > -jason > > > > > > On Tue, 17 Feb 2004, Wes Barris wrote: > > > > > > > Hi, > > > > > > > > ACE files generated by an application called tgicl have "CO" > > > > lines of the form: > > > > > > > > CO CL15Contig2 794 4 0 U > > > > > > > > This line is not parsed properly by the ace.pm bioperl module. > > > > Notice this line from Bio/Assembly/IO/ace.pm . > > > > > > > > (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && > > do { # New > > > > contig found! > > > > > > > > Bioperl expects the second "word" in the line to be > > "Contig\d+" where > > > > the number is used as the "contigID". Is there a reason why > > > > "contigID" must be a number? Why can't it be the whole second > > > > "word" of the "CO" line? 
> > > > > > > > > > -- > > > Jason Stajich > > > Duke University > > > jason at cgt.mc.duke.edu > > > > > > > > > -- > > Wes Barris > > E-Mail: Wes.Barris@csiro.au > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > -- Wes Barris E-Mail: Wes.Barris@csiro.au From rrouse at biomail.ucsd.edu Tue Feb 17 18:14:23 2004 From: rrouse at biomail.ucsd.edu (Richard Rouse) Date: Tue Feb 17 18:20:34 2004 Subject: [Bioperl-l] searchio scripts Message-ID: I recent installed bioperl-1.4 and am having problems with the blast report parsers in /examples/searchio/ When I run: perl hitwriter.pl blastreport I get: Using SearchIO->new() 0 Blast report(s) processed. Output sent to file: >hitwriter.out I get the same result with rawwriter.pl, hspwriter.pl and custom_writer.pl although the htmlwriter.pl and the blast_example.pl work fine. Has anyone else encountered this problem and figured out how to fix it? Thanks, Richard From redwards at utmem.edu Tue Feb 17 20:21:02 2004 From: redwards at utmem.edu (Rob Edwards) Date: Tue Feb 17 20:27:28 2004 Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM In-Reply-To: <4032B065.2000908@genetics.utah.edu> References: <4032B065.2000908@genetics.utah.edu> Message-ID: Barry, Thanks for this code. I really appreciate having an alternate method for calculating this. When I wrote the module I just couldn't figure out why the numbers wouldn't match. It seems Tm calculations are not as straightforward as they should (?) be. I always use 60C for my PCR annealing step and it works 99% of the time :-) I added your code to Bio::SeqFeature::Primer in CVS and updated t/Primer.pm too so that it passes. I left the original calculation in the module, but renamed it Tm_estimate so that it can be there for comparative purposes. 
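For comparison with the thermodynamic method, here is a minimal standalone sketch of the salt-adjusted estimate quoted earlier in this thread (Tm = 81.5 + 16.6(log10([Na+])) + .41*(%GC) - 600/length). The function name tm_estimate_sambrook and the 0.05 M default are assumptions for illustration; this is not the code in Bio::SeqFeature::Primer, and no claim is made that it reproduces that module's output.

```perl
use strict;
use warnings;

# Hypothetical standalone function, not the Bio::SeqFeature::Primer code.
# Implements the formula as quoted in the thread:
#   Tm = 81.5 + 16.6 * log10([Na+]) + 0.41 * (%GC) - 600 / length
sub tm_estimate_sambrook {
    my ($sequence, $salt_conc) = @_;
    $salt_conc = 0.05 unless defined $salt_conc;   # molar [Na+]; assumed default

    my $length = length $sequence;
    die "empty sequence\n" unless $length;

    # Count G and C bases, case-insensitively.
    my $gc_count   = () = $sequence =~ /[GC]/gi;
    my $percent_gc = 100 * $gc_count / $length;

    # Perl's log() is the natural log, so convert to base 10.
    my $log10_salt = log($salt_conc) / log(10);

    return 81.5 + 16.6 * $log10_salt + 0.41 * $percent_gc - 600 / $length;
}

printf "%-22s %.2f\n", $_, tm_estimate_sambrook($_)
    for qw(ACCGATACCG CATGGAGAGGGTGCAAATCC);
```

Because the length term is 600/length, the estimate drops sharply for very short oligos, which is one reason such rule-of-thumb values can sit far from nearest-neighbor results.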
Rob From wes.barris at csiro.au Tue Feb 17 20:56:04 2004 From: wes.barris at csiro.au (Wes Barris) Date: Tue Feb 17 21:02:28 2004 Subject: [Bioperl-l] msf output Message-ID: <4032C634.3000507@csiro.au> Hi, Msf files produced by StackPACK have coordinates listed above each group of sequence data. Bioperl msf files do not have this. I have tested a one-line addition that would fix this: Bio/AlignIO/msf.pm: while( $count < $length ) { # there is another block to go! + $self->_print (sprintf("%22s%-27d%27d\n",' ',$count+1,$count+50)); foreach $name ( @arr ) { $self->_print (sprintf("%-20s ",$name)); If there is a more formal way of submitting suggestions, please let me know. -- Wes Barris E-Mail: Wes.Barris@csiro.au From jason at cgt.duhs.duke.edu Tue Feb 17 21:02:19 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Feb 17 21:08:38 2004 Subject: [Bioperl-l] searchio scripts In-Reply-To: References: Message-ID: presumably the report in blastreport has a hit which is better than the signif cutoff of 0.1 my $in = Bio::SearchIO->new( -format => 'blast', -signif => 0.1, -verbose=> 0 ); -jason On Tue, 17 Feb 2004, Richard Rouse wrote: > I recent installed bioperl-1.4 and am having problems with the blast report > parsers in /examples/searchio/ > > > When I run: > perl hitwriter.pl blastreport > I get: > > Using SearchIO->new() > > 0 Blast report(s) processed. > Output sent to file: >hitwriter.out > > I get the same result with rawwriter.pl, hspwriter.pl and custom_writer.pl > although the htmlwriter.pl and the blast_example.pl work fine. > > Has anyone else encountered this problem and figured out how to fix it? 
>
> Thanks,
>
> Richard
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From barry.moore at genetics.utah.edu Tue Feb 17 19:23:01 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue Feb 17 21:12:11 2004
Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM
In-Reply-To:
References:
Message-ID: <4032B065.2000908@genetics.utah.edu>

John, Rob and others,

Well I certainly am not a Tm guru, but I'll reply none the less. I have
written a Tm calculator that uses thermodynamic parameters rather than
"rule of thumb" calculations. Mine follows the formula (and
modifications) described by Integrated DNA Technologies on their web site
(http://www.idtdna.com/program/techbulletins/Calculating_Tm_(melting_temperature).asp ).

This code uses the equation:

    Tm = dH / (dS + R * ln(C)) - 273.15

found in Breslauer (1986) and apparently used by GCG and the Python guys,
where R is the molar gas constant, and C is the molar concentration of
the oligo. It adds to that an adjustment for Na+ concentration as per
SantaLucia (1996) with some additional tweaking as described on the IDT
web page above to give:

    Tm = dH / (dS + R * ln(C)) - 273.15 + 12.0 * log10[Na+]

It uses the nearest-neighbor thermodynamic parameter set of Allawi and
SantaLucia (1997), but it looks like maybe it should be updated to the
SantaLucia (1998) parameter set.

I haven't read all the papers discussed in the various posts today, only
the couple that my code is based on (and had plenty of trouble
understanding all that was in those!), so I don't want to imply that the
equations that I use are based on a thorough review of oligo
thermodynamics literature, but the code seems to work, and it gives good
Tm values. Theoretically I should get the exact same values as IDT's web
calculator, but I don't.
My values are always very close, but off by a fraction to a couple
degrees. It may be due to a difference in parameter sets, although I'm
using the same one that IDT references on their site.

Rob, I morphed my code into your Primer Tm method, and tried it out. It
seems to work fine. It requires one extra parameter (oligo concentration)
that I just defaulted. If you want to use the code as is (or as a
starting point) it is yours to do with as you see fit. I can update the
"thermodynamic parameters" hash to the SantaLucia (1998) values if this
code looks promising and there is general agreement those values are the
better. I don't have CVS access, so I'll just post the modified method
code at the very end of this message.

I did a quick and dirty test to see how Tm values differ between your
Primer Tm method, my code, and IDT's web calculator. They tend towards
Tm(Rob) > Tm(IDT) > Tm(Barry). Here's the result:

Oligo                                               Primer.pm (Rob)  Primer.pm (Barry)  IDT
ACCGATACCG                                          34.49709793      29.41129054        31.3
ACCCGATCTAGTAGA                                     49.03043126      41.9210458         43.3
CATGGAGAGGGTGCAAATCC                                62.44709793      55.72210633        56.8
AAAGTAACCGAGAGAATCTGGAACA                           62.29709793      56.7940798         57.7
GGCTTTTGAAGTGGCAGAAAGACTGGGGGT                      71.76376459      67.19994018        68
CACTCGCCTGCTGGATGCAGAAGATGTGGATGTGC                 76.18281221      70.5586708         71.2
CTCTCCAGATGAAAAGTCTGTAATCACTTATGTGTCTTCG            71.29709793      63.54486627        64.1
ATTTATGATGCCTTCCCTAAAGTTCCTGAGGGTGGAGAAGGGATC       75.69709793      69.5961597         70.1
AGTGCTACGGAAGTGGACTCCAGGTGGCAAGAATACCAAAGCCGAGTGGA  80.03709793      75.41693338        75.9

Here are the papers I've referenced:

* Breslauer, K.J., Frank, R., Blocker, H., Marky, L.A. (1986)
  "Predicting DNA duplex stability from the base sequence." Proc.Natl.
  Acad. Sci. USA 83:3746-3750
* SantaLucia, Jr., J.S, Allawi, H.T., Seneviratne, P.A. (1996)
  "Improved nearest-neighbor parameters for predicting DNA duplex
  stability" Biochemistry 35:3555-3562.
* Allawi, H.T., SantaLucia, J. Jr. (1997) "Thermodynamics and NMR of
  internal G.T mismatches in DNA."
  Biochemistry 36: 10581-10594
* SantaLucia J. Jr. (1998) "A unified view of polymer, dumbbell, and
  oligonucleotide DNA nearest-neighbor thermodynamics" PNAS 95: 1460-1465.

Here is the new Tm method code:

=head2 Tm()

 Title   : Tm()
 Usage   : $tm = $primer->Tm(-salt=>'0.05')
 Function: Calculates and returns the Tm (melting temperature) of the primer
 Returns : A scalar containing the Tm.
 Args    : -salt set the Na+ concentration on which to base the calculation.
           (A parameter should be added to allow the oligo concentration to be set.)
 Notes   : Calculation of Tm as per Allawi et al. Biochemistry 1997 36:10581-10594.
           Also see documentation at http://biotools.idtdna.com/analyzer/ as they
           use this formula and have a couple nice help pages. These Tm values
           will be about 0.5-3 degrees off from those of the idtdna web tool.
           I don't know why.

=cut

sub Tm {
    my ($self, %args) = @_;
    my $salt_conc  = 0.05;        #salt concentration (molar units)
    my $oligo_conc = 0.00000025;  #oligo concentration (molar units)
    if ($args{'-salt'}) {$salt_conc = $args{'-salt'}}      #accept object defined salt concentration
    #if ($args{'-oligo'}) {$oligo_conc = $args{'-oligo'}}  #accept object defined oligo concentration
    my $seqobj   = $self->seq();
    my $length   = $seqobj->length();
    my $sequence = uc $seqobj->seq();
    my @dinucleotides;
    my $enthalpy;
    my $entropy;

    #Break sequence string into an array of all possible dinucleotides
    while ($sequence =~ /(.)(?=(.))/g) {
        push @dinucleotides, $1.$2;
    }

    #Build a hash with the thermodynamic values
    my %thermo_values = ('AA' => {'enthalpy' => -7.9,  'entropy' => -22.2},
                         'AC' => {'enthalpy' => -8.4,  'entropy' => -22.4},
                         'AG' => {'enthalpy' => -7.8,  'entropy' => -21},
                         'AT' => {'enthalpy' => -7.2,  'entropy' => -20.4},
                         'CA' => {'enthalpy' => -8.5,  'entropy' => -22.7},
                         'CC' => {'enthalpy' => -8,    'entropy' => -19.9},
                         'CG' => {'enthalpy' => -10.6, 'entropy' => -27.2},
                         'CT' => {'enthalpy' => -7.8,  'entropy' => -21},
                         'GA' => {'enthalpy' => -8.2,  'entropy' => -22.2},
                         'GC' => {'enthalpy' => -9.8,  'entropy' => -24.4},
                         'GG' => {'enthalpy' => -8,    'entropy' => -19.9},
                         'GT' => {'enthalpy' => -8.4,  'entropy' => -22.4},
                         'TA' => {'enthalpy' => -7.2,  'entropy' => -21.3},
                         'TC' => {'enthalpy' => -8.2,  'entropy' => -22.2},
                         'TG' => {'enthalpy' => -8.5,  'entropy' => -22.7},
                         'TT' => {'enthalpy' => -7.9,  'entropy' => -22.2},
                         'A'  => {'enthalpy' => 2.3,   'entropy' => 4.1},
                         'C'  => {'enthalpy' => 0.1,   'entropy' => -2.8},
                         'G'  => {'enthalpy' => 0.1,   'entropy' => -2.8},
                         'T'  => {'enthalpy' => 2.3,   'entropy' => 4.1}
                        );

    #Loop through dinucleotides and calculate cumulative enthalpy and entropy values
    for (@dinucleotides) {
        $enthalpy += $thermo_values{$_}{enthalpy};
        $entropy  += $thermo_values{$_}{entropy};
    }

    #Account for initiation parameters
    $enthalpy += $thermo_values{substr($sequence, 0, 1)}{enthalpy};
    $entropy  += $thermo_values{substr($sequence, 0, 1)}{entropy};
    $enthalpy += $thermo_values{substr($sequence, -1, 1)}{enthalpy};
    $entropy  += $thermo_values{substr($sequence, -1, 1)}{entropy};

    #Symmetry correction
    $entropy -= 1.4;

    my $r = 1.987; #molar gas constant
    my $tm = ($enthalpy * 1000 / ($entropy + ($r * log($oligo_conc))) - 273.15 + (12* (log($salt_conc)/log(10))));
    $self->{'Tm'}=$tm;
    return $tm;
}

From jason at cgt.duhs.duke.edu Tue Feb 17 21:16:45 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb 17 21:23:08 2004
Subject: [Bioperl-l] msf output
In-Reply-To: <4032C634.3000507@csiro.au>
References: <4032C634.3000507@csiro.au>
Message-ID:

Done. Thanks Wes!

-jason

On Wed, 18 Feb 2004, Wes Barris wrote:

> Hi,
>
> Msf files produced by StackPACK have coordinates listed above each
> group of sequence data. Bioperl msf files do not have this. I have
> tested a one-line addition that would fix this:
>
> Bio/AlignIO/msf.pm:
> while( $count < $length ) {
> # there is another block to go!
> + $self->_print (sprintf("%22s%-27d%27d\n",' ',$count+1,$count+50)); > foreach $name ( @arr ) { > $self->_print (sprintf("%-20s ",$name)); > > If there is a more formal way of submitting suggestions, please let > me know. > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From hlapp at gmx.net Wed Feb 18 04:19:30 2004 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed Feb 18 04:25:46 2004 Subject: [Bioperl-l] New GO Parser and errors loading biosql database In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE05@nrcmrdex1d.imsb.nrc.ca> Message-ID: <8E53E59D-61F3-11D8-97D5-000A959EB4C4@gmx.net> On Tuesday, February 17, 2004, at 11:42 AM, Law, Annie wrote: > Hi, > > I would appreciate help with the following. > I have installed the newest bioperl-db and biosql schema from cvs. > I tried to load the database with information from godatabase.org and > got > some errors listed further below (the > Tables did not fill at all). > Next I tried to load the database with Locuslink data from NCBI. > > 1)I got the LL file from NCBI and tried to load an empty datbase with a > LL_tmpl file (for human) and > It seemed to load properly and the tables were filling up but then it > stopped after about 900 bioentries. > I'm not sure what went wrong. There seem to be a complaint about > duplicate > entry but I don't think I should > Modify the source file. > It should never be necessary to modify the source file. First of all, unless you're testing or debugging something and actually *want* to get thrown out upon the first error, you should always specify --safe, for load_ontology.pl as well as for load_seqdatabase.pl. This will roll back a sequence that fails to load, but will otherwise keep going. 
> [root@ data]# perl /root/bioperl-db/scripts/biosql/load_seqdatabase.pl > --dbuser=root > --dbpass=mss22 --dbname bioseqdb --namespace "LocusLink" -format > locuslink > /var/lib/mysql/LL_ > _tmpl --dbpass=bioinf1 --dbname bioseqdb --namespace "LocusLink" > -format > locuslink /var/lib/mysql/LL_tmpl > Loading /var/lib/mysql/LL_tmpl ... > > -------------------- WARNING --------------------- > MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values > were > ("GO:0005699","kinetochore","","") FKs (6) > Duplicate entry 'kinetochore-6' for key 2 > --------------------------------------------------- This basically means that there is already another term 'kinetochore' in the same ontology, but with a different GO id. I.e., the look-up by GO id failed for this one, prompting the system to insert the term as a new one, which then (unexpectedly) fails too because of the unique key violation. This is not atypical for annotation being a work in progress. Also, and actually more likely, you may have had remnants of previous data loads there. GO terms get merged with others, and then previously primary IDs either get retired or become secondary IDs. A database of annotated genes like LL may not be immediately up to date. Especially for LL the best thing is to always pre-load GO and other ontologies that sequences are associated (annotated) with. Also, it's not a bad idea to pre-load the NCBI taxonomy database using the script load_ncbi_taxonomy.pl in the biosql repository. > > 2) Updating GO parser > I saw that the GO parser was updated recently and I have located the > code > version 1.17.2.1 I downloaded the new version. I am using bioperl 1.4. > Should I just take the new dagflat.pm and replace the old one or are > there > more steps involved? Not really. There is also an updated test but you don't need that. > When I download whole Modules I need to use make > commands. > > Also I saw that dagflat.pm requires graph.pm. 
It's not actually dagflat that requires it but the OntologyEngineI implementation it populates behind the scenes (Bio::Ontology::SimpleGOEngine if you're curious). But as a consequence of all this, yes, if you use the dagflat parser (and goflat and soflat are basically just other names for the same parser) you do need Graph.pm from CPAN. > Is this graph.pm part of the > bioperl 1.4 package I couldn't seem to find it or do I need to > download and > install from CPAN. You get it from CPAN. The name is Graph.pm. If the CPAN shell doesn't understand that, ask it to install Graph::Directed. > I searched CPAN for graph.pm and got several hits. Is > this the one I need? http://search.cpan.org/~mverb/GDGraph-1.43/ > Do I also need GD.pm? I think I saw somewhere that it is required? > http://search.cpan.org/~lds/GD-2.11/GD.pm > Although this could be a mistake > You do not need GD (or GDGraph or whatever) for bioperl-db. > Where is the best place to install GD and graph.pm (with dagflat.pm or > the > main perl library)? > I'm not sure whether the main perl library is /usr/lib/perl5/5.8.0 or > /usr/lib/perl5/site_perl/5.8.0/Bio The CPAN shell will do that automatically. Also, if you just say 'make install' in a perl module's root source directory, it will be installed in the right place. The only thing to be careful about is to use the same perl for running 'perl Makefile.PL' that you otherwise use for running perl scripts. > > > 3) I installed Bioperl-db and downloaded the biosql schema > successfully but > when I tried to use the Load_ontology.pl I got some errors which seem > to be > saying that I am missing some main modules such as goflat (I recorded a > script of the output). But I have goflat.pm. > Am I calling the perl script incorrectly? Or are there still some > modules I > need to install? I'm not sure that I am using the correct > Syntax for the format field.
It is failing because you don't have Graph.pm installed, as the stack trace states: > Bio::OntologyIO: goflat cannot be found > Exception > ------------- EXCEPTION ------------- > MSG: Failed to load module Bio::OntologyIO::goflat. Can't locate > Graph/Directed.pm in @INC (@INC contains: The initial message that goflat.pm cannot be found is just a (wrong in this case) interpretation of the failure to dynamically load the module. -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From lstein at cshl.edu Wed Feb 18 04:23:45 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Wed Feb 18 04:45:36 2004 Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new wormbase models. In-Reply-To: References: Message-ID: <200402181123.45491.lstein@cshl.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'll make a new prepackaged aggregator for this as soon as WormBase makes the final and long-anticipated transition to real genes. Lincoln On Wednesday 18 February 2004 12:10 am, Todd Harris wrote: > Hi Phillip - > > You need to aggregate the separate parts of the CDS. Create a > wormbase_cds (or whatever you wish to call it), aggregating the > following features using the CDS group: coding_exon,5_UTR,3_UTR. > > The following stanza should do the trick. > > $dbgff = Bio::DB::GFF->new(-adaptor => 'dbi::mysql', > -dsn => > 'dbi:mysql:database=your_database;host=your_host', -aggregators => > [qw(wormbase_cds{coding_exon,5_UTR,3_UTR/CDS})], -user => > 'your_username', > -pass => 'your_dbgff_pass'); > > This should do the trick for properly aggregating genes under the > new WormBase CDS class. > > Todd Harris > > > On 2/17/04 2:18 PM, Philip MacMenamin wrote: > > > > I cannot get things to aggregate properly in the new wormbase > > models.
Previously this worked: > > $aggregator = Bio::DB::GFF::Aggregator->new(-method => 'test', > > -sub_parts => ['UTR','CDS:curated'] > > ); > > (Or I could of more simply used the 'processed_transcript' > > aggregator that Linoln wrote.) > > > > And one feature, with the UTRs "aggregated" together with the > > curated CDS's, was returned: > > test:5_UTR(AH6.5), start->9524078 end->9526248 > > This could be fed to some kind of glyph, and would draw a nice > > picture of UTRs hanging off a coding seq. > > > > [--------------] [----------] [----> > > (You have to use your imaginatio a bit here) > > > > With the new models of wormbase, this is not the case, so I am > > re-writing code to accomadate these changes. > > > > Now I am returned for example 3 features: a 3 prime UTR, a prime, > > and a CDS. test:UTR(5_UTR:AH6.5)9524078 9524086 > > test:UTR(3_UTR:AH6.5)9525782 9526248 > > test:curated(AH6.5)9524087 9525781 > > > > The glyph then draws these things on two planes. > > [] [-------> > > [---------] [------------] [----------] > > > > My assumption is that a single feature can be rendered as a > > single glyph. And if you have >1 feature, then >1 glyphs are > > needed. Therefore, I assume the problem stems from my failure to > > aggregate UTR and CDS features. > > > > I have noticed that the new SQL GFF Db used to have two fmethods > > for UTRs, one for both 3 and 5 primes. The new one has only one. > > There are other changes of course, but I think this is important. > > > > So, essentially I want to draw UTRs with CDS's attached... with > > the new wormbase models. If anyone knows how to do this, i'd like > > to hear from them. > > > > Thanks :) > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l - -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFAMy8h0CIvUP7P+AkRAtCSAJwPVPdXqs9rXSFYdCD8lVhsB/5wkACdEOrr EwcJXiat61tP3F5XJXA1j+c= =P0TB -----END PGP SIGNATURE----- From lstein at cshl.edu Wed Feb 18 04:57:06 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Wed Feb 18 05:28:28 2004 Subject: [Bioperl-l] Re: BOSC 2004 In-Reply-To: <4031E4C5.5020409@egenetics.com> References: <4031E4C5.5020409@egenetics.com> Message-ID: <200402181157.06253.lstein@cshl.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 http://open-bio.org/bosc2004/ If you scroll down the open bio page, you'll see a huge banner advertisement for BOSC. Maybe you have images turned off? Best, Lincoln On Tuesday 17 February 2004 11:54 am, you wrote: > Hi Lincoln > > Where can I find info about BOSC 2004? Seems there isn't anything > on open-bio.org - I'm looking for info on deadlines for poster > submissions, etc. > > Peter - -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFAMzby0CIvUP7P+AkRAi/ZAJ9x6zi5v3Ehe45S81wAjDXTNp4eqACeO4Vu Relp13w9sW06zEVJcBpkFZ4= =0YAy -----END PGP SIGNATURE----- From steve_chervitz at affymetrix.com Wed Feb 18 05:27:57 2004 From: steve_chervitz at affymetrix.com (Steve Chervitz) Date: Wed Feb 18 05:34:11 2004 Subject: [Bioperl-l] searchio scripts In-Reply-To: References: Message-ID: <1EC43881-61FD-11D8-A988-000A95765236@affymetrix.com> Looks like there was a change in the Root::IO.pm module that affects the way these scripts process command-line arguments. As of bioperl-1.303, the SearchIO::blast module appears to be unable to read data from STDIN or files listed in @ARGV. This affects the scripts in examples/searchio and scripts/searchio. 
As a workaround, I'd recommend you iterate over @ARGV in your script and initialize the SearchIO object using the -file option to new(), as in:

    while (my $file = shift @ARGV) {
        my $in = Bio::SearchIO->new( -format => 'blast',
                                     -file   => $file );
        while ( my $result = $in->next_result() ) {
            # process result...
        }
    }

As far as tracking down the cause, I've pinpointed the following change in Bio::Root::IO::_readline():

    my $fh = $self->_fh or return;    # revision 1.50 (bioperl-1.303)

formerly this was:

    my $fh = $self->_fh || \*ARGV;    # revision 1.49 (bioperl-1.302)

This also appears to break SeqIO reading from STDIN. Try executing this at the top-level distribution dir for the 1.302 and 1.303 releases:

    perl -I. ./scripts/seq/translate_seq.PLS -format fasta < t/data/dna1.fa

According to Lincoln's commit log, the Root::IO::_readline() change was necessary to get the GFF, SeqFeature, and Registry regression tests working. I tested these tests with the 1.49 version of IO.pm and the only one that was affected was SeqFeature.t. Specifically, test #6 which calls SeqFeature::Generic::gff_string() hangs and waits for input before proceeding. I'm not sure why this is... (getting late).

BTW, platforms tested: Perl 5.6.1 and 5.8.0 on Linux (RH9) and Perl 5.8.1-RC3 on MacOS X (10.3.2).

Steve

On Feb 17, 2004, at 3:14 PM, Richard Rouse wrote: > I recent installed bioperl-1.4 and am having problems with the blast > report > parsers in /examples/searchio/ > > > When I run: > perl hitwriter.pl blastreport > I get: > > Using SearchIO->new() > > 0 Blast report(s) processed. > Output sent to file: >hitwriter.out > > I get the same result with rawwriter.pl, hspwriter.pl and > custom_writer.pl > although the htmlwriter.pl and the blast_example.pl work fine. > > Has anyone else encountered this problem and figured out how to fix it?
> > Thanks, > > Richard > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > From Alexandre.Irrthum at icr.ac.uk Wed Feb 18 07:31:27 2004 From: Alexandre.Irrthum at icr.ac.uk (Alexandre Irrthum) Date: Wed Feb 18 07:37:50 2004 Subject: [Bioperl-l] AlignIO warning Message-ID: Hi there, The snippet of code shown below works fine (with bioperl 1.4), but it issues this warning when next_aln() is called: -------------------- WARNING --------------------- MSG: Must provide which type of BLAST was run (blastp,blastn, tblastn, tblastx, blastx) if you want strand information to get set properly for DNA query or subjects #!/usr/bin/perl use warnings; use strict; use Bio::Seq; use Bio::Tools::Run::StandAloneBlast; use Bio::AlignIO; my $seq1 = Bio::Seq->new(-display_id => 'Sequence1', -seq => 'AGGATAGGGCGGATAGGTAGCGCCGATTTACGCGATACGCG'); my $seq2 = Bio::Seq->new(-display_id => 'Sequence2', -seq => 'AGGATAGGGCAGATAGGTAGCGCCGATTTACGTGATACGCG'); my $factory = Bio::Tools::Run::StandAloneBlast->new(program => 'blastn', outfile => 'bl2seq.out'); $factory->bl2seq($seq1, $seq2); my $str = Bio::AlignIO->new(-file => 'bl2seq.out', -format => 'bl2seq'); my $aln = $str->next_aln(); ###### Warning issued here ###### foreach my $seq ($aln->each_seq()) { print $seq->seq(), "\n"; } How am I supposed to provide program name ? Thank you for your help. Alex From lstein at cshl.edu Wed Feb 18 07:57:24 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Wed Feb 18 08:04:17 2004 Subject: [Bioperl-l] searchio scripts In-Reply-To: <1EC43881-61FD-11D8-A988-000A95765236@affymetrix.com> References: <1EC43881-61FD-11D8-A988-000A95765236@affymetrix.com> Message-ID: <200402181457.24179.lstein@cshl.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Or do this: my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV); while (my $result = $in->next_result()) { ... } That might even be easier. 
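For anyone wondering why passing \*ARGV brings the old behavior back: ARGV is the filehandle behind Perl's magic <> operator, so reading from it walks through every file named in @ARGV and falls back to STDIN when @ARGV is empty. The following is a minimal pure-Perl sketch of that mechanism only, with no BioPerl involved; count_fasta_headers is a hypothetical helper invented for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): counts FASTA header lines
# read from whatever filehandle it is given.
sub count_fasta_headers {
    my ($fh) = @_;
    my $n = 0;
    while ( my $line = <$fh> ) {
        $n++ if $line =~ /^>/;
    }
    return $n;
}

# Handing in \*ARGV gives the same input semantics as the <> operator:
# each file listed in @ARGV is read in turn, or STDIN if @ARGV is empty.
#
#   print count_fasta_headers(\*ARGV), "\n";
```

Because the caller chooses the handle, a write-only object never touches STDIN unless it is explicitly asked to, which is the trade-off described in this thread.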
Lincoln On Wednesday 18 February 2004 12:27 pm, Steve Chervitz wrote: > Looks like there was a change in the Root::IO.pm module that > affects the way these scripts process command-line arguments. As of > bioperl-1.303, the SearchIO::blast module appears to be unable to > read data from STDIN or files listed in @ARGV. This affects the > scripts in examples/searchio and scripts/searchio. > > As a workaround, I'd recommend you iterate over @ARGV in your > script and initialize the SearchIO object using the -file option to > new(), as in: > > while (my $file = shift @ARGV) { > my $in = Bio::SearchIO->new( -format => 'blast', > -file => $file > ); > while ( my $result = $in->next_result() ) { > # process result... > } > } > > As far as tracking down the cause, I've pinpointed the following > change in Bio::Root::IO::_readline(): > > my $fh = $self->_fh or return; # revision 1.50 > (bioperl-1.303) > > formerly this was: > > my $fh = $self->_fh || \*ARGV; # revision 1.49 > (bioperl-1.302) > > This also appears to break SeqIO reading from STDIN. Try executing > this at the top-level distribution dir for the 1.302 and 1.303 > releases: > > perl -I. ./scripts/seq/translate_seq.PLS -format fasta < > t/data/dna1.fa > > According to Lincoln's commit log, the Root::IO::_readline() change > was necessary to get the GFF, SeqFeature, and Registry regression > tests working. I tested these tests with the 1.49 version of IO.pm > and the only one that was affected was SeqFeature.t. Specifically, > test #6 which calls SeqFeature::Generic::gff_string() hangs and > waits for input before proceeding. I'm not sure why this is... > (getting late). > > BTW, platforms tested: Perl 5.6.1 and 5.8.0 on Linux (RH9) and Perl > 5.8.1-RC3 on MacOS X (10.3.2). 
> > Steve > > On Feb 17, 2004, at 3:14 PM, Richard Rouse wrote: > > I recent installed bioperl-1.4 and am having problems with the > > blast report > > parsers in /examples/searchio/ > > > > > > When I run: > > perl hitwriter.pl blastreport > > I get: > > > > Using SearchIO->new() > > > > 0 Blast report(s) processed. > > Output sent to file: >hitwriter.out > > > > I get the same result with rawwriter.pl, hspwriter.pl and > > custom_writer.pl > > although the htmlwriter.pl and the blast_example.pl work fine. > > > > Has anyone else encountered this problem and figured out how to > > fix it? > > > > Thanks, > > > > Richard > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l - -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFAM2E00CIvUP7P+AkRAurTAJ9gwb4Os0M5uDWhlE40JphLRIAG+gCfQ5Ji zXHLGwtfDAB2Np2nKBZkuw0= =IsKs -----END PGP SIGNATURE----- From sdavis2 at mail.nih.gov Wed Feb 18 08:17:37 2004 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed Feb 18 08:13:35 2004 Subject: [Bioperl-l] Blast results question Message-ID: I have a large number of blast results against the (human) genome. The query was a large number of oligos (from microarray) taken from various ESTs or full-length transcripts. I have many that are "broken" by splice sites in the genome resulting in two different "hits" near each other. Is there someone who has code or suggestions about how to "stitch" these hits back together? 
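One way to frame the stitching problem, as a rough sketch rather than a tested pipeline: group the HSPs by hit sequence and strand, then keep a group only if its pieces together cover the whole query and lie within some maximum genomic span. Everything here is an assumption made for illustration: the HSPs are plain hashes rather than BioPerl objects, stitch_hsps and its field names are hypothetical, hit coordinates are taken to satisfy hstart <= hend, and colinearity of the pieces is not checked.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical sketch: each HSP is a hash with keys
#   hit, strand, qstart, qend (query coords), hstart, hend (genomic coords).
# Returns one record per hit/strand group whose HSPs jointly cover all
# $query_len query positions within a genomic span of at most $max_span.
sub stitch_hsps {
    my ( $query_len, $max_span, @hsps ) = @_;

    my %groups;
    push @{ $groups{ join ':', $_->{hit}, $_->{strand} } }, $_ for @hsps;

    my @candidates;
    for my $key ( sort keys %groups ) {
        my @group = @{ $groups{$key} };

        # Union of query positions covered by this group's HSPs.
        my %covered;
        for my $hsp (@group) {
            $covered{$_} = 1 for $hsp->{qstart} .. $hsp->{qend};
        }

        my $start = min map { $_->{hstart} } @group;
        my $end   = max map { $_->{hend} } @group;

        next if $end - $start + 1 > $max_span;            # pieces too far apart
        next if scalar( keys %covered ) < $query_len;     # query not fully covered

        push @candidates,
            { locus => $key, start => $start, end => $end, pieces => scalar @group };
    }
    return @candidates;
}
```

With Bio::SearchIO the hash fields could presumably be filled from $hit->name, $hsp->strand('hit'), $hsp->start('query'), $hsp->end('query'), $hsp->start('hit') and $hsp->end('hit'); a mismatch filter per HSP would have to be added separately.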
Thanks, Sean From nathanhaigh at ukonline.co.uk Wed Feb 18 08:39:36 2004 From: nathanhaigh at ukonline.co.uk (Nathan Haigh) Date: Wed Feb 18 08:45:58 2004 Subject: [Bioperl-l] gap characters in SimpleAlign objects Message-ID: I've been using the clustalw module for creating alignment, and I've just realised that when you output the alignment the gap character is a "." not a "-". This is most annoying because I am adding support to this module for generating trees via clustalw, and clustalw removes these "." characters. Is there a method for changing these gap characters to "-". I have seen the gap_char method in the SimpleAlign module, but this seems only to designate a particular character as a gap character, and does not actually change the character. Any ideas on how to do this substitution, and where in BioPerl does this assignment get made in the first place, since the default gap char for clustalw output is "-" not "." Thanks Nathan From john.herbert at clinical-pharmacology.oxford.ac.uk Wed Feb 18 08:48:34 2004 From: john.herbert at clinical-pharmacology.oxford.ac.uk (john herbert) Date: Wed Feb 18 08:55:02 2004 Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM Message-ID: Hello All. Thanks for all you suggestions and code. I did have a go myself but it was not until Barry supplied this code that I actually got it. Kind regards, John. >>> Barry Moore 18/02/2004 00:23:01 >>> John, Rob and others, Well I certainly am not a Tm guru, but I'll reply none the less. I have written a Tm calculator that uses thermodynamic parameters rather than "rule of thumb" calculations. Mine follows the formula (and modifications) described by Integrated DNA Technologies on their web site (http://www.idtdna.com/program/techbulletins/Calculating_Tm_(melting_temperature).asp ). 
This code uses the equation: Tm = dH / (dS + R * ln(C)) - 273.15, found in Breslauer (1986) and apparently used by GCG and the Python guys, where R is the molar gas constant and C is the molar concentration of the oligo. It adds to that an adjustment for Na+ concentration as per SantaLucia (1996), with some additional tweaking as described on the IDT web page above, to give: Tm = dH / (dS + R * ln(C)) - 273.15 + 12.0 * log10[Na+]. It uses the nearest-neighbor thermodynamic parameter set of Allawi and SantaLucia (1997), but it looks like maybe it should be updated to the SantaLucia (1998) parameter set. I haven't read all the papers discussed in the various posts today, only the couple that my code is based on (and had plenty of trouble understanding all that was in those!), so I don't want to imply that the equations that I use are based on a thorough review of oligo thermodynamics literature, but the code seems to work, and it gives good Tm values. Theoretically I should get the exact same values as IDT's web calculator, but I don't. My values are always very close, but off by a fraction to a couple degrees. It may be due to a difference in parameter sets, although I'm using the same one that IDT references on their site. Rob, I morphed my code into your Primer Tm method, and tried it out. It seems to work fine. It requires one extra parameter (oligo concentration) that I just defaulted. If you want to use the code as is (or as a starting point) it is yours to do with as you see fit. I can update the "thermodynamic parameters" hash to the SantaLucia (1998) values if this code looks promising and there is general agreement those values are the better. I don't have CVS access, so I'll just post the modified method code at the very end of this message. I did a quick and dirty test to see how Tm values differ between your Primer Tm method, my code, and IDT's web calculator. They tend towards Tm(Rob) > Tm(IDT) > Tm(Barry).
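As a sanity check on the formula, here is a worked evaluation for the oligo ACCGATACCG, using the nearest-neighbor values from the method code at the end of this message and its defaults (C = 2.5e-7 M oligo, 0.05 M Na+). The arithmetic is reconstructed from that code rather than taken from the original post: dH is summed in kcal/mol and multiplied by 1000 to match dS in cal/(mol K), the terminal A and G contribute the initiation terms, and the code's 1.4 symmetry correction is subtracted from dS.

```latex
\begin{align*}
\Delta H &= -76.6 + 2.3 + 0.1 = -74.2\ \text{kcal/mol} \\
\Delta S &= -202.9 + 4.1 - 2.8 - 1.4 = -203.0\ \text{cal/(mol K)} \\
T_m &= \frac{1000\,\Delta H}{\Delta S + R \ln C} - 273.15 + 12 \log_{10}[\mathrm{Na^+}] \\
    &= \frac{-74200}{-203.0 + 1.987 \ln(2.5 \times 10^{-7})} - 273.15 + 12 \log_{10}(0.05) \\
    &\approx 318.17 - 273.15 - 15.61 \approx 29.41\ ^{\circ}\mathrm{C}
\end{align*}
```

This agrees with the Primer.pm (Barry) figure for the first oligo in the comparison that follows.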
Here's the result:

Oligo                                               Primer.pm (Rob)  Primer.pm (Barry)  IDT
ACCGATACCG                                          34.49709793      29.41129054        31.3
ACCCGATCTAGTAGA                                     49.03043126      41.9210458         43.3
CATGGAGAGGGTGCAAATCC                                62.44709793      55.72210633        56.8
AAAGTAACCGAGAGAATCTGGAACA                           62.29709793      56.7940798         57.7
GGCTTTTGAAGTGGCAGAAAGACTGGGGGT                      71.76376459      67.19994018        68
CACTCGCCTGCTGGATGCAGAAGATGTGGATGTGC                 76.18281221      70.5586708         71.2
CTCTCCAGATGAAAAGTCTGTAATCACTTATGTGTCTTCG            71.29709793      63.54486627        64.1
ATTTATGATGCCTTCCCTAAAGTTCCTGAGGGTGGAGAAGGGATC       75.69709793      69.5961597         70.1
AGTGCTACGGAAGTGGACTCCAGGTGGCAAGAATACCAAAGCCGAGTGGA  80.03709793      75.41693338        75.9

Here are the papers I've referenced:

* Breslauer, K.J., Frank, R., Blocker, H., Marky, L.A. (1986) "Predicting DNA duplex stability from the base sequence." Proc. Natl. Acad. Sci. USA 83:3746-3750.
* SantaLucia, J. Jr., Allawi, H.T., Seneviratne, P.A. (1996) "Improved nearest-neighbor parameters for predicting DNA duplex stability." Biochemistry 35:3555-3562.
* Allawi, H.T., SantaLucia, J. Jr. (1997) "Thermodynamics and NMR of internal G.T mismatches in DNA." Biochemistry 36:10581-10594.
* SantaLucia, J. Jr. (1998) "A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics." PNAS 95:1460-1465.

Here is the new Tm method code:

=head2 Tm()

 Title   : Tm()
 Usage   : $tm = $primer->Tm(-salt=>'0.05')
 Function: Calculates and returns the Tm (melting temperature) of the primer
 Returns : A scalar containing the Tm.
 Args    : -salt set the Na+ concentration on which to base the calculation.
           (A parameter should be added to allow the oligo concentration
           to be set.)
 Notes   : Calculation of Tm as per Allawi et al., Biochemistry 1997
           36:10581-10594. Also see documentation at
           http://biotools.idtdna.com/analyzer/ as they use this formula and
           have a couple nice help pages. These Tm values will be about
           0.5-3 degrees off from those of the idtdna web tool. I don't
           know why.

=cut

sub Tm {
    my ($self, %args) = @_;
    my $salt_conc  = 0.05;        #salt concentration (molar units)
    my $oligo_conc = 0.00000025;  #oligo concentration (molar units)
    if ($args{'-salt'}) {$salt_conc = $args{'-salt'}} #accept object defined salt concentration
    #if ($args{'-oligo'}) {$oligo_conc = $args{'-oligo'}} #accept object defined oligo concentration
    my $seqobj   = $self->seq();
    my $length   = $seqobj->length();
    my $sequence = uc $seqobj->seq();
    my @dinucleotides;
    my $enthalpy;
    my $entropy;

    #Break sequence string into an array of all possible dinucleotides
    while ($sequence =~ /(.)(?=(.))/g) {
        push @dinucleotides, $1.$2;
    }

    #Build a hash with the thermodynamic values
    my %thermo_values = (
        'AA' => {'enthalpy' => -7.9,  'entropy' => -22.2},
        'AC' => {'enthalpy' => -8.4,  'entropy' => -22.4},
        'AG' => {'enthalpy' => -7.8,  'entropy' => -21},
        'AT' => {'enthalpy' => -7.2,  'entropy' => -20.4},
        'CA' => {'enthalpy' => -8.5,  'entropy' => -22.7},
        'CC' => {'enthalpy' => -8,    'entropy' => -19.9},
        'CG' => {'enthalpy' => -10.6, 'entropy' => -27.2},
        'CT' => {'enthalpy' => -7.8,  'entropy' => -21},
        'GA' => {'enthalpy' => -8.2,  'entropy' => -22.2},
        'GC' => {'enthalpy' => -9.8,  'entropy' => -24.4},
        'GG' => {'enthalpy' => -8,    'entropy' => -19.9},
        'GT' => {'enthalpy' => -8.4,  'entropy' => -22.4},
        'TA' => {'enthalpy' => -7.2,  'entropy' => -21.3},
        'TC' => {'enthalpy' => -8.2,  'entropy' => -22.2},
        'TG' => {'enthalpy' => -8.5,  'entropy' => -22.7},
        'TT' => {'enthalpy' => -7.9,  'entropy' => -22.2},
        'A'  => {'enthalpy' => 2.3,   'entropy' => 4.1},
        'C'  => {'enthalpy' => 0.1,   'entropy' => -2.8},
        'G'  => {'enthalpy' => 0.1,   'entropy' => -2.8},
        'T'  => {'enthalpy' => 2.3,   'entropy' => 4.1}
    );

    #Loop through dinucleotides and calculate cumulative enthalpy and entropy values
    for (@dinucleotides) {
        $enthalpy += $thermo_values{$_}{enthalpy};
        $entropy  += $thermo_values{$_}{entropy};
    }

    #Account for initiation parameters
    $enthalpy += $thermo_values{substr($sequence, 0, 1)}{enthalpy};
    $entropy  += $thermo_values{substr($sequence, 0, 1)}{entropy};
    $enthalpy += $thermo_values{substr($sequence, -1, 1)}{enthalpy};
    $entropy  += $thermo_values{substr($sequence, -1, 1)}{entropy};

    #Symmetry correction
    $entropy -= 1.4;

    my $r  = 1.987; #molar gas constant
    my $tm = ($enthalpy * 1000 / ($entropy + ($r * log($oligo_conc))) - 273.15 + (12* (log($salt_conc)/log(10))));
    $self->{'Tm'}=$tm;
    return $tm;
}

From birney at ebi.ac.uk Wed Feb 18 08:54:30 2004 From: birney at ebi.ac.uk (Ewan Birney) Date: Wed Feb 18 09:00:46 2004 Subject: [Bioperl-l] gap characters in SimpleAlign objects In-Reply-To: Message-ID: On Wed, 18 Feb 2004, Nathan Haigh wrote: > I've been using the clustalw module for creating alignment, and I've just > realised that when you output the alignment the gap character is a "." not a > "-". > This is most annoying because I am adding support to this module for > generating trees via clustalw, and clustalw removes these "." characters. Is > there a method for changing these gap characters to "-". I have seen the > gap_char method in the SimpleAlign module, but this seems only to designate > a particular character as a gap character, and does not actually change the > character. > > Any ideas on how to do this substitution, and where in BioPerl does this > assignment get made in the first place, since the default gap char for > clustalw output is "-" not "." To fix (short term): Loop over the sequences making a new SimpleAlign object with LocatableSeqs and s/\./-/g on the seq strings How are you reading in Clustalw alignments?
The Bio::AlignIO::clustalw doesn't touch the gap characters: foreach my $name ( sort { $order{$a} <=> $order{$b} } keys %alignments ) { if( $name =~ /(\S+):(\d+)-(\d+)/ ) { ($sname,$start,$end) = ($1,$2,$3); } else { ($sname, $start) = ($name,1); my $str = $alignments{$name}; $str =~ s/[^A-Za-z]//g; $end = length($str); } my $seq = new Bio::LocatableSeq('-seq' => $alignments{$name}, '-id' => $sname, '-start' => $start, '-end' => $end); ($alignments{$name} has no regex put on it earlier either) > > Thanks > Nathan > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > ----------------------------------------------------------------- Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420 . ----------------------------------------------------------------- From jason at cgt.duhs.duke.edu Wed Feb 18 09:09:02 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Wed Feb 18 09:15:19 2004 Subject: [Bioperl-l] get sequence failure (fwd) Message-ID: -- Jason Stajich Duke University jason at cgt.mc.duke.edu ---------- Forwarded message ---------- Date: Wed, 18 Feb 2004 09:42:21 +0100 (CET) From: "[iso-8859-1] william ritchie" To: Jason Stajich Subject: Re: [Bioperl-l] get sequence failure sorry, even by using refseq, I m getting ID unknown!! Thanks Yahoo! Mail : votre e-mail personnel et gratuit qui vous suit partout ! Cr?ez votre Yahoo! Mail sur http://fr.benefits.yahoo.com/ From jason at cgt.duhs.duke.edu Wed Feb 18 09:14:18 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Wed Feb 18 09:21:09 2004 Subject: [Bioperl-l] gap characters in SimpleAlign objects In-Reply-To: References: Message-ID: $aln->map_chars('\.','-'); On Wed, 18 Feb 2004, Nathan Haigh wrote: > I've been using the clustalw module for creating alignment, and I've just > realised that when you output the alignment the gap character is a "." not a > "-". 
> This is most annoying because I am adding support to this module for > generating trees via clustalw, and clustalw removes these "." characters. Is > there a method for changing these gap characters to "-". I have seen the > gap_char method in the SimpleAlign module, but this seems only to designate > a particular character as a gap character, and does not actually change the > character. > > Any ideas on how to do this substitution, and where in BioPerl does this > assignment get made in the first place, since the default gap char for > clustalw output is "-" not "." > > Thanks > Nathan > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From jason at cgt.duhs.duke.edu Wed Feb 18 09:15:27 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Wed Feb 18 09:21:51 2004 Subject: [Bioperl-l] gap characters in SimpleAlign objects In-Reply-To: References: Message-ID: It's easier than this - not sure why gaps are becoming '.' but I had to work around this in other places as well Coordinate::Pair. $aln->map_chars('\.','-') --jason On Wed, 18 Feb 2004, Ewan Birney wrote: > On Wed, 18 Feb 2004, Nathan Haigh wrote: > > > I've been using the clustalw module for creating alignment, and I've just > > realised that when you output the alignment the gap character is a "." not a > > "-". > > This is most annoying because I am adding support to this module for > > generating trees via clustalw, and clustalw removes these "." characters. Is > > there a method for changing these gap characters to "-". I have seen the > > gap_char method in the SimpleAlign module, but this seems only to designate > > a particular character as a gap character, and does not actually change the > > character. 
> > > > Any ideas on how to do this substitution, and where in BioPerl does this > > assignment get made in the first place, since the default gap char for > > clustalw output is "-" not "." > > To fix (short term): Loop over the sequences making a new SimpleAlign > object with LocatableSeqs and s/\./-/ on the seq strings > > > > How are you reading in Clustalw alignments? The Bio::AlignIO::clustalw > doesn't touch the gap characters: > > > foreach my $name ( sort { $order{$a} <=> $order{$b} } keys %alignments > ) { > if( $name =~ /(\S+):(\d+)-(\d+)/ ) { > ($sname,$start,$end) = ($1,$2,$3); > } else { > ($sname, $start) = ($name,1); > my $str = $alignments{$name}; > $str =~ s/[^A-Za-z]//g; > $end = length($str); > } > my $seq = new Bio::LocatableSeq('-seq' => $alignments{$name}, > '-id' => $sname, > '-start' => $start, > '-end' => $end); > > > > ($alignments{$name} has no regex put on it earlier either) > > > > > > > Thanks > > Nathan > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > ----------------------------------------------------------------- > Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420 > . > ----------------------------------------------------------------- > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From jason at cgt.duhs.duke.edu Wed Feb 18 09:18:43 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Wed Feb 18 09:25:05 2004 Subject: [Bioperl-l] Blast results question In-Reply-To: References: Message-ID: You might want to try using est2genome/sim4/spidey/exonerate on the region where you have your multiple hits if you care about the splice sites/predicting a gene structure. 
use the start & end methods from a Search::Hit object to get the min/max location of the hits in the hit sequence $hit->start('hit'),$hit->end('hit'), extract this seq, and run one of the EST->genome aligners. Or are you just trying to locate this approximate region of the genome in the first place? -jason On Wed, 18 Feb 2004, Sean Davis wrote: > I have a large number of blast results against the (human) genome. The > query was a large number of oligos (from microarray) taken from various ESTs > or full-length transcripts. I have many that are "broken" by splice sites > in the genome resulting in two different "hits" near each other. Is there > someone who has code or suggestions about how to "stitch" these hits back > together? > > Thanks, > Sean > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From sdavis2 at mail.nih.gov Wed Feb 18 09:33:35 2004 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed Feb 18 09:29:38 2004 Subject: [Bioperl-l] Blast results question In-Reply-To: Message-ID: On 2/18/04 9:18 AM, "Jason Stajich" wrote: > You might want to try using est2genome/sim4/spidey/exonerate on the > region where you have your multiple hits if you care about the splice > sites/predicting a gene structure. use the start & end methods from a > Search::Hit object to get the min/max location of the hits in the hit > sequence $hit->start('hit'),$hit->end('hit'), extract this seq, and run > one of the EST->genome aligners. Or are you just trying to locate this > approximate region of the genome in the first place? You guessed it with the last question. To clarify, what I need to do is to determine whether there is a "best" hit against the genome. I am interested in hits with just a couple of mismatches or fewer, but over the entire length of the query sequence. 
Therefore, I need to know when I have a query hitting to a piece of genomic sequence where the only discontinuity is due to a splice site.

> -jason
>
> On Wed, 18 Feb 2004, Sean Davis wrote:
>
>> I have a large number of blast results against the (human) genome. The
>> query was a large number of oligos (from microarray) taken from various ESTs
>> or full-length transcripts. I have many that are "broken" by splice sites
>> in the genome resulting in two different "hits" near each other. Is there
>> someone who has code or suggestions about how to "stitch" these hits back
>> together?
>>
>> Thanks,
>> Sean
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu

From jason at cgt.duhs.duke.edu Wed Feb 18 09:33:46 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 18 09:40:06 2004
Subject: [Bioperl-l] AlignIO warning
In-Reply-To:
References:
Message-ID:

Short answer: add -report_type => 'reporttype'. I recognize documentation was lacking there; I am fixing that.

Now that SearchIO can parse bl2seq reports, the AlignIO parser is just a shortcut convenience for folks; if you look at the code you'll see this is the case.

my $str = Bio::AlignIO->new(-file => 'bl2seq.out',
                            -format => 'bl2seq',
                            -report_type => 'blastn');

However you don't have to create the output file and re-feed it to AlignIO.
If you have passed in '_READMETHOD' => 'BLAST' (which is the default for 1.4 StandAloneBlast) when initializing the factory object, then you get back a SearchIO object for bl2seq and blast alignment runs:

my $searchio = $factory->bl2seq($seq1,$seq2);
my $r = $searchio->next_result;
my $hit = $r->next_hit;
for my $hsp ( $hit->hsps ) {
    my $aln = $hsp->get_aln; # Bio::SimpleAlign object
}

In the spirit of TMTOWTDI, you can also get a Bio::Tools::BPbl2seq parser if you pass in 'BPlite' to _READMETHOD in StandAloneBlast and get back an object with a slightly different API. This is the old way of parsing these reports, now superseded by SearchIO.

-jason

On Wed, 18 Feb 2004, Alexandre Irrthum wrote:

> Hi there,
>
> The snippet of code shown below works fine (with bioperl 1.4), but it
> issues this warning when next_aln() is called:
>
>
> -------------------- WARNING ---------------------
> MSG: Must provide which type of BLAST was run (blastp,blastn, tblastn,
> tblastx, blastx) if you want strand information to get set properly for
> DNA query or subjects
>
>
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;
> use Bio::Seq;
> use Bio::Tools::Run::StandAloneBlast;
> use Bio::AlignIO;
>
> my $seq1 = Bio::Seq->new(-display_id => 'Sequence1', -seq =>
> 'AGGATAGGGCGGATAGGTAGCGCCGATTTACGCGATACGCG');
> my $seq2 = Bio::Seq->new(-display_id => 'Sequence2', -seq =>
> 'AGGATAGGGCAGATAGGTAGCGCCGATTTACGTGATACGCG');
> my $factory = Bio::Tools::Run::StandAloneBlast->new(program => 'blastn',
> outfile => 'bl2seq.out');
> $factory->bl2seq($seq1, $seq2);
> my $str = Bio::AlignIO->new(-file => 'bl2seq.out', -format => 'bl2seq');
> my $aln = $str->next_aln(); ###### Warning issued here ######
> foreach my $seq ($aln->each_seq()) {
> print $seq->seq(), "\n";
> }
>
> How am I supposed to provide program name ?
>
> Thank you for your help.
>
> Alex

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From nathanhaigh at ukonline.co.uk Wed Feb 18 10:44:43 2004
From: nathanhaigh at ukonline.co.uk (Nathan Haigh)
Date: Wed Feb 18 10:51:09 2004
Subject: [Bioperl-l] gap characters in SimpleAlign objects
In-Reply-To:
Message-ID:

OK, I think I've figured out where my confusion lies: I thought that the default output format from clustalw would be clustalw format, but as it turns out it's GCG (MSF), which has '.' as its gap character.

I think I've figured out the problem (well, at least part of it!): there are a couple of lines in clustalw.pm that retrieve the alignment that was generated by clustalw, but I think these may not have been updated since the addition of more alignment formats to AlignIO. As a result it defaults to MSF unless you have specified 'phylip' as the output format in the alignment factory parameters @params.

As a result I have replaced the following lines:

my $format= $output =~/phylip/i ? "phylip" : "MSF";
my $in = Bio::AlignIO->new(-file => $outfile, '-format' => $format);

with

$self->output('MSF') if !$self->output();
my $in = Bio::AlignIO->new(-file => $outfile, '-format' => $self->output());

This leaves the default file format as MSF (although I think clustalw would be a more obvious choice) but allows the user to specify any of the other supported formats. I will then use $aln->map_chars('\.','-') to change the gap characters around.

The problem with this is that if you do not specify an output format, the default MSF is used (which uses '.' as gaps) and then when you create an output alignment stream in fasta format you get '.' as gaps (I'm pretty sure fasta format requires '-' as the gap symbol). Therefore, would it not be safer to check for the correct gap symbol in the fasta AlignIO module?
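For anyone who wants the substitution on raw strings rather than through SimpleAlign, here is a minimal sketch in plain Perl. The hash of id => aligned string is only a stand-in for whatever holds the alignment rows (an assumption for illustration); it does on plain strings what the map_chars('\.','-') call mentioned above does on an alignment object.

```perl
use strict;
use warnings;

# Replace every '.' gap with '-' in each aligned row.
# %$rows maps a sequence id to its aligned string (hypothetical layout).
# The input hash is left untouched; a fixed copy is returned.
sub dots_to_dashes {
    my ($rows) = @_;
    my %fixed;
    for my $id (keys %$rows) {
        (my $str = $rows->{$id}) =~ tr/./-/;   # tr treats '.' literally
        $fixed{$id} = $str;
    }
    return \%fixed;
}

# Invented MSF-style rows.
my %msf_rows = (
    seq1 => 'ATG..CGT.A',
    seq2 => 'ATGTTCGTAA',
);
my $fixed = dots_to_dashes(\%msf_rows);
print "$_\t$fixed->{$_}\n" for sort keys %$fixed;
```

Note that a plain s/\./-/ changes only the first '.' in a string; it needs the /g flag, or tr as used here, to catch every gap.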
Thanks Nathan > -----Original Message----- > From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu] > Sent: 18 February 2004 14:15 > To: Ewan Birney > Cc: Nathan Haigh; bioperl-l@bioperl.org > Subject: Re: [Bioperl-l] gap characters in SimpleAlign objects > > It's easier than this - not sure why gaps are becoming '.' but I had to > work around this in other places as well Coordinate::Pair. > $aln->map_chars('\.','-') > > --jason > > On Wed, 18 Feb 2004, Ewan Birney wrote: > > > On Wed, 18 Feb 2004, Nathan Haigh wrote: > > > > > I've been using the clustalw module for creating alignment, and I've just > > > realised that when you output the alignment the gap character is a "." not a > > > "-". > > > This is most annoying because I am adding support to this module for > > > generating trees via clustalw, and clustalw removes these "." characters. Is > > > there a method for changing these gap characters to "-". I have seen the > > > gap_char method in the SimpleAlign module, but this seems only to designate > > > a particular character as a gap character, and does not actually change the > > > character. > > > > > > Any ideas on how to do this substitution, and where in BioPerl does this > > > assignment get made in the first place, since the default gap char for > > > clustalw output is "-" not "." > > > > To fix (short term): Loop over the sequences making a new SimpleAlign > > object with LocatableSeqs and s/\./-/ on the seq strings > > > > > > > > How are you reading in Clustalw alignments? 
The Bio::AlignIO::clustalw > > doesn't touch the gap characters: > > > > > > foreach my $name ( sort { $order{$a} <=> $order{$b} } keys %alignments > > ) { > > if( $name =~ /(\S+):(\d+)-(\d+)/ ) { > > ($sname,$start,$end) = ($1,$2,$3); > > } else { > > ($sname, $start) = ($name,1); > > my $str = $alignments{$name}; > > $str =~ s/[^A-Za-z]//g; > > $end = length($str); > > } > > my $seq = new Bio::LocatableSeq('-seq' => $alignments{$name}, > > '-id' => $sname, > > '-start' => $start, > > '-end' => $end); > > > > > > > > ($alignments{$name} has no regex put on it earlier either) > > > > > > > > > > > > Thanks > > > Nathan > > > > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l@portal.open-bio.org > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > ----------------------------------------------------------------- > > Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420 > > . > > ----------------------------------------------------------------- > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu From jegreenwood25 at hotmail.com Wed Feb 18 10:53:21 2004 From: jegreenwood25 at hotmail.com (Jonathan Greenwood) Date: Wed Feb 18 10:59:37 2004 Subject: [Bioperl-l] Clickable Glyphs... Message-ID: Hi, I've submitted my code with the email, what I'm trying to do is to render a Genbank file as a png file, I need to make each glyph clickable(I'm also displaying this page online)...any help with the new changes to Bio::Graphics::Panel would be appreciated...many thanks... Sincerely, Jonathan Greenwood email: jonathon@mgcheo.med.uottawa.ca code: #! 
/usr/local/bin/perl -wT use strict; use Bio::Graphics; use Bio::SeqIO; use Bio::SeqFeature::Generic; use CGI; use CGI::Pretty; my $file = 'x65306.gb'; my $io = Bio::SeqIO->new(-file=>$file); my $seq = $io->next_seq; my $wholeseq = Bio::SeqFeature::Generic->new(-start=>1, -end=>$seq->length); my @features = $seq->all_SeqFeatures; my $q = new CGI; # sort features by their primary tags my %sorted_features; for my $f (@features) { my $tag = $f->primary_tag; push @{$sorted_features{$tag}},$f; } print $q->header( 'text/html' ); print $q->start_html('A Vector Rendering'); my $panel = Bio::Graphics::Panel->new(-length => $seq->length, -width => 1000, -pad_left => 10, -pad_right => 10, -key_color => 'white', -key_spacing => 15, -key_style => 'bottom', -spacing => -0.25, -box_subparts => 'true' ); my ($url,$map,$mapname) = $panel->image_and_map(-root => '/webfiles/cgi-bin', -url => '/tmpimages', ); $panel->add_track($wholeseq, -glyph => 'arrow', -bump => +1, -double => 1, -tick => 2 ); $panel->add_track($wholeseq, -glyph => 'generic', -bgcolor => 'purple', -height => 12, -key => 'Whole Sequence', -title => 'Whole Sequence' ); # special feature if ($sorted_features{CDS}) { $panel->add_track($sorted_features{CDS}, -glyph => 'transcript2', -bgcolor => 'orange', -bump => +1, -height => 12, -key => 'CDS', -label => \&gene_label, -title => 'CDS', -link => 'feature1.html#CDS' ); delete $sorted_features{'CDS'}; } #general case my @colors = qw(wheat blue yellow green cyan chartreuse magenta gray); my $idx = 0; for my $tag (sort keys %sorted_features) { my $features = $sorted_features{$tag}; $panel->add_track($features, -glyph => 'generic', -bgcolor => $colors[$idx++ % @colors], -fgcolor => 'black', -font2color => 'red', -key => "${tag}s", -bump => +1, -height => 12, -label => \&gene_label, -description => \&generic_description, -title => \&gene_label, -link => 'feature1.html#$tag', ); } print $q->img({-src=>$url,-usemap=>"#$mapname"}); print $q->$map; print $q->($panel->png); print 
$q->exit_html; exit; sub gene_label { my $feature = shift; my @notes; foreach (qw(product gene)) { next unless $feature->has_tag($_); @notes = $feature->each_tag_value($_); last; } $notes[0]; } sub generic_description { my $feature = shift; my $description; foreach ($feature->all_tags) { my @values = $feature->each_tag_value($_); $description .= $_ eq 'note' ? "@values" : "$_=@values; "; } $description =~ s/; $//; # get rid of last $description; } _________________________________________________________________ The new MSN 8: smart spam protection and 2 months FREE* http://join.msn.com/?page=features/junkmail http://join.msn.com/?page=dept/bcomm&pgmarket=en-ca&RU=http%3a%2f%2fjoin.msn.com%2f%3fpage%3dmisc%2fspecialoffers%26pgmarket%3den-ca From pm66 at nyu.edu Wed Feb 18 11:41:46 2004 From: pm66 at nyu.edu (Philip MacMenamin) Date: Wed Feb 18 11:51:06 2004 Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new =?iso-8859-1?q?wormbase models=2E?= In-Reply-To: References: Message-ID: <200402181644.i1IGii05005378@mx2.nyu.edu> Thanks very much Todd... What version of wormbase are you using? I am using WS118. I am not able to get this aggregator to return me the UTR bits. 
For instance, I connect to the db using your agg: my $db = new Bio::DB::GFF(-adaptor=>'dbi::mysqlopt', # -dsn=>'dbi:mysql:wormbase110;host=localhost', -dsn=>'dbi:mysql:wormbase118;host=localhost', -user=>$user, -pass=>$passwd, -aggregator => [qw(wormabse_cds{coding_exon,5_UTR,3_UTR/CDS})], ) or die(); #...ask for a segment in the AH6.5 region: my $panelSeg = $db->segment(CDS=>'AH6.5'); #...make a searchSegment a little larger to pick everything: my $searchSeg =$db->segment($panelSeg->sourceseq, ($panelSeg->abs_start-1000), ($panelSeg->abs_end+1000)); #and, then get the features that wormabse_cds pulls: my @all_transcripts = $searchSeg->features('wormabse_cds'); foreach my $transcript ( @all_transcripts ) { print $transcript, $transcript->abs_start,' ', $transcript->abs_end,"\n"; } I assume this is the right way to do things. But, the problem is that this does not get my UTRs. This does: my @UTRs = $searchSeg->features('UTR:UTR'); foreach my $UTR (@UTRs) { print $UTR," ",$UTR->start," ",$UTR->end,"\n"; } But, these are not aggregated obviously. This is the output of the little script above: UTR:UTR(5_UTR:AH6.5) 9524078 9524086 UTR:UTR(3_UTR:AH6.5) 9525782 9526248 ... wormabse_cds:curated(AH6.5)9524087 9525781 And you can see that the wormabse_cds does not overlap with the UTRs. Sorry about this. I have been trying all sorts of things... it just keeps on missing the UTRs in the new wormbase models. And we can't re-sync the site here till this works. Philip. On Tuesday 17 February 2004 05:10 pm, Todd Harris wrote: > Hi Phillip - > > You need to aggregate the separate parts of the CDS. Create a wormbase_cds > (or whatever you wish to call it), aggregating the following features using > the CDS group: coding_exon,5_UTR,3_UTR. > > The following stanza should do the trick. 
> > $dbgff = (-adaptor => 'dbi::mysql', > -dsn => 'dbi:mysql:database=your_database;host=your_host', > -aggregators => [qw(wormabse_cds{coding_exon,5_UTR,3_UTR/CDS})], > -user => 'your_username', > -pass => 'your_dbgff_pass'); > > This should do the trick for properly aggregating genes under the new > WormBase CDS class. > > Todd Harris From laurichj at bioinfo.ucr.edu Wed Feb 18 11:49:50 2004 From: laurichj at bioinfo.ucr.edu (Josh Lauricha) Date: Wed Feb 18 11:56:06 2004 Subject: [Bioperl-l] Clickable Glyphs... In-Reply-To: References: Message-ID: <20040218164950.GA3094@bioinfo.ucr.edu> Not entirely sure what you are trying to do, but the way I've been doing the same sort of thing was with two scripts. The first generates the HTML, the second generates the PNG. To do this you create a panel as if you were going to make an image in both. But for the HTML you do: @boxes = $panel->boxes() rather than $panel->png(). You could do boxes() and png() on the same object if you don't mind having temp images laying around (typically insecure). Or have a switch argument passed via GET telling it to do the HTMl or the PNG: My scripts are for a webbased tree displayer (kind of like forester), a sequence displayer that highlights the glyph you click on in the sequence (changes the text color) and a blast results image with clickable HSPs. All done basically the same way (well, the tree is done with graphics modules not yet in bioperl). On Wed 02/18/04 10:53, Jonathan Greenwood wrote: > Hi, I've submitted my code with the email, what I'm trying to do is to > render a Genbank file as a png file, I need to make each glyph > clickable(I'm also displaying this page online)...any help with the new > changes to Bio::Graphics::Panel would be appreciated...many thanks... > > Sincerely, > > Jonathan Greenwood > email: jonathon@mgcheo.med.uottawa.ca > > code: > #! 
/usr/local/bin/perl -wT > > use strict; > use Bio::Graphics; > use Bio::SeqIO; > use Bio::SeqFeature::Generic; > use CGI; > use CGI::Pretty; > > my $file = 'x65306.gb'; > my $io = Bio::SeqIO->new(-file=>$file); > my $seq = $io->next_seq; > my $wholeseq = Bio::SeqFeature::Generic->new(-start=>1, > > -end=>$seq->length); > my @features = $seq->all_SeqFeatures; > my $q = new CGI; > > # sort features by their primary tags > my %sorted_features; > for my $f (@features) { > my $tag = $f->primary_tag; > push @{$sorted_features{$tag}},$f; > } > > print $q->header( 'text/html' ); > print $q->start_html('A Vector Rendering'); > > my $panel = Bio::Graphics::Panel->new(-length => $seq->length, > -width => 1000, > -pad_left => 10, > -pad_right => 10, > -key_color => 'white', > -key_spacing => 15, > -key_style => 'bottom', > -spacing => -0.25, > -box_subparts => 'true' > ); > > my ($url,$map,$mapname) = $panel->image_and_map(-root => > '/webfiles/cgi-bin', > -url => '/tmpimages', > ); > > $panel->add_track($wholeseq, > -glyph => 'arrow', > -bump => +1, > -double => 1, > -tick => 2 > ); > > $panel->add_track($wholeseq, > -glyph => 'generic', > -bgcolor => 'purple', > -height => 12, > -key => 'Whole Sequence', > -title => 'Whole Sequence' > ); > > # special feature > if ($sorted_features{CDS}) { > $panel->add_track($sorted_features{CDS}, > -glyph => 'transcript2', > -bgcolor => 'orange', > -bump => +1, > -height => 12, > -key => 'CDS', > -label => \&gene_label, > -title => 'CDS', > -link => 'feature1.html#CDS' > ); > delete $sorted_features{'CDS'}; > } > > #general case > my @colors = qw(wheat blue yellow green cyan chartreuse magenta gray); > my $idx = 0; > for my $tag (sort keys %sorted_features) { > my $features = $sorted_features{$tag}; > $panel->add_track($features, > -glyph => 'generic', > -bgcolor => $colors[$idx++ % @colors], > -fgcolor => 'black', > -font2color => 'red', > -key => "${tag}s", > -bump => +1, > -height => 12, > -label => \&gene_label, > -description => 
\&generic_description, > -title => \&gene_label, > -link => 'feature1.html#$tag', > ); > } > > print $q->img({-src=>$url,-usemap=>"#$mapname"}); > print $q->$map; > print $q->($panel->png); > > print $q->exit_html; > > exit; > > sub gene_label { > my $feature = shift; > my @notes; > foreach (qw(product gene)) { > next unless $feature->has_tag($_); > @notes = $feature->each_tag_value($_); > last; > } > $notes[0]; > } > > sub generic_description { > my $feature = shift; > my $description; > foreach ($feature->all_tags) { > my @values = $feature->each_tag_value($_); > $description .= $_ eq 'note' ? "@values" : "$_=@values; "; > } > $description =~ s/; $//; # get rid of last > $description; > } > > _________________________________________________________________ > The new MSN 8: smart spam protection and 2 months FREE* > http://join.msn.com/?page=features/junkmail > http://join.msn.com/?page=dept/bcomm&pgmarket=en-ca&RU=http%3a%2f%2fjoin.msn.com%2f%3fpage%3dmisc%2fspecialoffers%26pgmarket%3den-ca > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- ---------------------------- | Josh Lauricha | | laurichj@bioinfo.ucr.edu | | Bioinformatics, UCR | |--------------------------| From steve_chervitz at affymetrix.com Wed Feb 18 15:16:13 2004 From: steve_chervitz at affymetrix.com (Steve Chervitz) Date: Wed Feb 18 15:22:24 2004 Subject: [Bioperl-l] searchio scripts In-Reply-To: <200402181457.24179.lstein@cshl.edu> References: <1EC43881-61FD-11D8-A988-000A95765236@affymetrix.com> <200402181457.24179.lstein@cshl.edu> Message-ID: <4C662059-624F-11D8-A988-000A95765236@affymetrix.com> Good tip, Lincoln. But regardless, the change in IO::_readline's behavior means that any script that depended on its pre-1.303 default-to-STDIN behavior is now broken. This could be a lot since the code in examples and scripts exploited this. 
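The difference between the two versions comes down to what happens when no filehandle was supplied. A minimal sketch in plain Perl, using stand-ins for the two one-liners quoted in this thread; the sub names are made up and neither is BioPerl's actual _readline.

```perl
use strict;
use warnings;

# $fh plays the role of $self->_fh in Bio::Root::IO.
sub pick_fh_old {            # revision 1.49 style
    my ($fh) = @_;
    return $fh || \*ARGV;    # no handle given: fall back to <> semantics
}

sub pick_fh_new {            # revision 1.50 style
    my ($fh) = @_;
    $fh or return;           # no handle given: give up, read nothing
    return $fh;
}

# An in-memory file stands in for a handle passed via -fh or -file.
my $text = "line one\n";
open my $in, '<', \$text or die "cannot open in-memory file: $!";

for my $case ([ 'explicit handle', $in ], [ 'no handle', undef ]) {
    my ($label, $fh) = @$case;
    my $old = pick_fh_old($fh);
    my $new = pick_fh_new($fh);
    printf "%-16s old=%s new=%s\n", $label,
        ($old == \*ARGV ? 'ARGV'  : 'given'),
        (defined $new   ? 'given' : 'none');
}
```

With an explicit handle both versions behave the same; with none, the old code silently reads from @ARGV/STDIN while the new code returns nothing, which is why scripts relying on the default now report zero results.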
I received three messages about it yesterday, so I fear there could be many others out there scratching their heads, especially considering that the default-to-STDIN behavior has been around since the early days of SeqIO. From the SeqIO docs:

> $seqIO = Bio::SeqIO->new(-format => $format);
> ....
> If neither a filehandle nor a filename is specified, then the module
> will read from the @ARGV array or STDIN, using the familiar <>
> semantics.

Relying on a default behavior of a dependent module (Root::IO) always troubled me. It seems a better design to make it explicit in your script where you expect your input to come from. Typing "-fh=>\*ARGV" or putting an @ARGV loop around your script is extra work, but I think it's a change for the better.

(BTW, this situation also exposes a weakness in the test code which didn't test the default _readline behavior -- I guess doing this is difficult within the Perl test framework).

The issue remains: What to do about backwards compatibility? Some options:

1. Fix all of the scripts, examples, POD docs, bptutorial etc. to not rely on default STDIN/@ARGV reading behavior of _readline and release these as part of bioperl-1.4.1.

2. Revert _readline to its old behavior and add a new method in IO.pm that has the new behavior (_readline2). Update any module/script that needs the new _readline behaviour to use _readline2.

#2 is the backward-compatible route, but uglier from a software engineering perspective. #1 breaks backward compatibility.

Given the legacy of the old _readline behaviour, I'm favoring #2. Just seems more politic. We could still update the scripts and docs to discourage the old _readline behaviour.

Thoughts?

Steve

On Feb 18, 2004, at 4:57 AM, Lincoln Stein wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Or do this:
>
> my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV);
> while (my $result = $in->next_result()) {
> ...
> }
>
> That might even be easier.
> > Lincoln > > On Wednesday 18 February 2004 12:27 pm, Steve Chervitz wrote: >> Looks like there was a change in the Root::IO.pm module that >> affects the way these scripts process command-line arguments. As of >> bioperl-1.303, the SearchIO::blast module appears to be unable to >> read data from STDIN or files listed in @ARGV. This affects the >> scripts in examples/searchio and scripts/searchio. >> >> As a workaround, I'd recommend you iterate over @ARGV in your >> script and initialize the SearchIO object using the -file option to >> new(), as in: >> >> while (my $file = shift @ARGV) { >> my $in = Bio::SearchIO->new( -format => 'blast', >> -file => $file >> ); >> while ( my $result = $in->next_result() ) { >> # process result... >> } >> } >> >> As far as tracking down the cause, I've pinpointed the following >> change in Bio::Root::IO::_readline(): >> >> my $fh = $self->_fh or return; # revision 1.50 >> (bioperl-1.303) >> >> formerly this was: >> >> my $fh = $self->_fh || \*ARGV; # revision 1.49 >> (bioperl-1.302) >> >> This also appears to break SeqIO reading from STDIN. Try executing >> this at the top-level distribution dir for the 1.302 and 1.303 >> releases: >> >> perl -I. ./scripts/seq/translate_seq.PLS -format fasta < >> t/data/dna1.fa >> >> According to Lincoln's commit log, the Root::IO::_readline() change >> was necessary to get the GFF, SeqFeature, and Registry regression >> tests working. I tested these tests with the 1.49 version of IO.pm >> and the only one that was affected was SeqFeature.t. Specifically, >> test #6 which calls SeqFeature::Generic::gff_string() hangs and >> waits for input before proceeding. I'm not sure why this is... >> (getting late). >> >> BTW, platforms tested: Perl 5.6.1 and 5.8.0 on Linux (RH9) and Perl >> 5.8.1-RC3 on MacOS X (10.3.2). 
>> >> Steve >> >> On Feb 17, 2004, at 3:14 PM, Richard Rouse wrote: >>> I recent installed bioperl-1.4 and am having problems with the >>> blast report >>> parsers in /examples/searchio/ >>> >>> >>> When I run: >>> perl hitwriter.pl blastreport >>> I get: >>> >>> Using SearchIO->new() >>> >>> 0 Blast report(s) processed. >>> Output sent to file: >hitwriter.out >>> >>> I get the same result with rawwriter.pl, hspwriter.pl and >>> custom_writer.pl >>> although the htmlwriter.pl and the blast_example.pl work fine. >>> >>> Has anyone else encountered this problem and figured out how to >>> fix it? >>> >>> Thanks, >>> >>> Richard >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l@portal.open-bio.org >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l@portal.open-bio.org >> http://portal.open-bio.org/mailman/listinfo/bioperl-l > > - -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.1 (GNU/Linux) > > iD8DBQFAM2E00CIvUP7P+AkRAurTAJ9gwb4Os0M5uDWhlE40JphLRIAG+gCfQ5Ji > zXHLGwtfDAB2Np2nKBZkuw0= > =IsKs > -----END PGP SIGNATURE----- > From barry.moore at genetics.utah.edu Wed Feb 18 16:46:22 2004 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Wed Feb 18 16:52:41 2004 Subject: [Bioperl-l] Pretty Output of Alignments Message-ID: <4033DD2E.8050304@genetics.utah.edu> Are there modules in Bioperl that produce shaded and colored output of multiple sequence alignments in various formats (PS, RTF, HTML) similar to what can be made with tools like BOXSHADE, TEXshade, Alscript. I'm pretty sure that the answer is no, but thought I'd check to be sure. 
I know that there are Bioperl wrappers for the EMBOSS and PISE versions of these tools, but wanted something that could be tweaked for more control over the output than those allow.

From barry.moore at genetics.utah.edu Wed Feb 18 16:49:01 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed Feb 18 16:55:17 2004
Subject: [Bioperl-l] Pretty Output of Alignments
Message-ID: <4033DDCD.4000806@genetics.utah.edu>

Oops! Forgot to sign that post.

Are there modules in Bioperl that produce shaded and colored output of multiple sequence alignments in various formats (PS, RTF, HTML) similar to what can be made with tools like BOXSHADE, TEXshade, Alscript? I'm pretty sure that the answer is no, but thought I'd check to be sure. I know that there are Bioperl wrappers for the EMBOSS and PISE versions of these tools, but wanted something that could be tweaked for more control over the output than those allow.

Barry Moore
Dept. of Human Genetics
University of Utah

From qdong at genome.stanford.edu Wed Feb 18 17:09:27 2004
From: qdong at genome.stanford.edu (Stan Dong)
Date: Wed Feb 18 17:18:20 2004
Subject: [Bioperl-l] SGD GFF3 file available soon
Message-ID: <1DE37948-625F-11D8-89C8-000A956A0A36@genome.stanford.edu>

Hi,

I am a programmer at Saccharomyces Genome Database ( SGD, http://www.yeastgenome.org/ ). I am working on developing a flat file in GFF3 format ( http://song.sourceforge.net/gff3-jan04.shtml ) to represent sequence features of the yeast genome, and it will soon be released on our ftp site. This is very useful because quite a few open source software packages, such as Gbrowse and Chado, can take this file format as input.

I would like comments from people who are interested in doing similar things and those who have good/not-so-good experience with GFF3 to share. For me, it took a while to get the specification done, especially to make the third column (type) fully compatible with Sequence Ontology (SO).
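For readers who have not met the format: a GFF3 record is nine tab-separated columns, with the feature type third and the attributes last, written as semicolon-separated tag=value pairs. A minimal parsing sketch in plain Perl follows. The sample record and its values are invented for illustration and are not taken from the SGD file, and the sub names are made up; a production parser has to handle more cases (comments, directives, embedded sequence).

```perl
use strict;
use warnings;

my @COLUMNS = qw(seqid source type start end score strand phase attributes);

# Undo the %XX escaping GFF3 uses for reserved characters.
sub unescape {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Split one GFF3 feature line into a hash of its nine columns; the
# attributes column becomes a hash of tag => [values].
# Returns undef if the line does not have exactly nine columns.
sub parse_gff3_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;
    return unless @fields == 9;

    my %feature;
    @feature{@COLUMNS} = @fields;

    my %attr;
    for my $pair (split /;/, $feature{attributes}) {
        my ($tag, $value) = split /=/, $pair, 2;
        next unless defined $value;
        # Multiple values for one tag are comma-separated.
        $attr{ unescape($tag) } = [ map { unescape($_) } split /,/, $value ];
    }
    $feature{attributes} = \%attr;
    return \%feature;
}

# Invented example record, not a line from the SGD file.
my $line = join "\t", 'chrI', 'example', 'gene', 1000, 2000, '.', '+', '.',
    'ID=geneA;Name=geneA;Ontology_term=GO:0000001,GO:0000002;Note=a%20made-up%20gene';

my $f = parse_gff3_line($line);
print "$f->{type} $f->{start}-$f->{end}\n";
print "GO terms: @{ $f->{attributes}{Ontology_term} }\n";
print "Note: $f->{attributes}{Note}[0]\n";
```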
One thing I liked about GFF3 is the last column (attributes), where you can put all kinds of useful information, such as, in our case, GO annotation and a nice description of a feature. An example file of SGD GFF3 can be viewed here.

ftp://genome-ftp.stanford.edu/pub/people/curator/GFF3Example.txt

Thanks,

Stan Dong
Programmer, SGD

From jason at cgt.duhs.duke.edu Wed Feb 18 17:30:45 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 18 17:37:07 2004
Subject: [Bioperl-l] SGD GFF3 file available soon
In-Reply-To: <1DE37948-625F-11D8-89C8-000A956A0A36@genome.stanford.edu>
References: <1DE37948-625F-11D8-89C8-000A956A0A36@genome.stanford.edu>
Message-ID:

Stan - I am very much looking forward to this - up till now I have had to reformat the .tab file myself just to get a working GFF3 where Bio::DB::GFF aggregators would behave properly. Looks like what you have for the gene/CDS sets and will give it a try in my db/scripts. Will be very happy to see it all consolidated as you have done, so nice work.

-jason

On Wed, 18 Feb 2004, Stan Dong wrote:

> Hi,
>
> I am a programmer at Saccharomyces Genome Database ( SGD,
> http://www.yeastgenome.org/ ). I am working on developing a flat file
> in GFF3 format ( http://song.sourceforge.net/gff3-jan04.shtml ) to
> represent sequence features of yeast genome and it will soon be
> released on our ftp site. This is very useful because quite a few open
> source softwares can take this file format as input such as Gbrowse,
> Chado etc.
>
> I would like comments from people who are interested in doing similar
> things and those who have good/not-so-good experience on GFF3 to share
> with. For me, it took a while to get the specification done especially
> make the third column (type) fully compatible with Sequence Ontology
> (SO). One thing I liked about GFF3 is the last column (attributes)
> where you can put all kinds of useful information such as in our case
> GO annotation and a nice description of a feature.
An example file of > SGD GFF3 can be viewed here. > > ftp://genome-ftp.stanford.edu/pub/people/curator/GFF3Example.txt > > Thanks, > > Stan Dong > Programmer, SGD > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From rrouse at biomail.ucsd.edu Wed Feb 18 18:09:44 2004 From: rrouse at biomail.ucsd.edu (Richard Rouse) Date: Wed Feb 18 18:15:52 2004 Subject: [Bioperl-l] searchio scripts In-Reply-To: <4C662059-624F-11D8-A988-000A95765236@affymetrix.com> Message-ID: I tried Steve's suggestion by putting this right above: while ( my $blast = $in->next_result() ) { Then putting another } at the end of the script. Doing this and then running a large blast output file, I got: Using SearchIO->new() Report 1: Bio::Search::Result::BlastResult=HASH(0x87d7fb8) Report 2: Bio::Search::Result::BlastResult=HASH(0x8dfea60) ------------- EXCEPTION ------------- MSG: Trouble in ResultTableWriter::_set_row_data_func() eval: ------------- EXCEPTION ------------- MSG: Can't get identical or conserved data: no data. 
STACK Bio::Search::Hit::GenericHit::matches ../..//Bio/Search/Hit/GenericHit.pm:852 STACK Bio::Search::Hit::GenericHit::frac_identical ../..//Bio/Search/Hit/GenericHit.pm:1043 STACK (eval) (eval 310):1 STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__ ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:327 STACK Bio::SearchIO::Writer::HitTableWriter::to_string ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267 STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321 STACK Bio::SearchIO::blast::write_result ../..//Bio/SearchIO/blast.pm:1495 STACK toplevel new.mod.hitwriter.pl:106 -------------------------------------- STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__ ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:329 STACK Bio::SearchIO::Writer::HitTableWriter::to_string ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267 STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321 STACK Bio::SearchIO::blast::write_result ../..//Bio/SearchIO/blast.pm:1495 STACK toplevel new.mod.hitwriter.pl:106 I tried Lincoln's suggestion as well. In this case I added: my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV); above while ( my $blast = $in->next_result() ) { This script just runs getting no result. By the way I am running Suse linux 9.0, perl 5.8.1 Thanks, Richard -----Original Message----- From: Steve Chervitz [mailto:steve_chervitz@affymetrix.com] Sent: Wednesday, February 18, 2004 12:16 PM To: Lincoln Stein Cc: Richard Rouse; Bioperl Subject: Re: [Bioperl-l] searchio scripts Good tip, Lincoln. But regardless, the change in IO::_readline's behavior means that any script that depended on its pre-1.303 default-to-STDIN behavior is now broken. This could be a lot since the code in examples and scripts exploited this. I received three messages about it yesterday, so I fear there could be many others out there scratching their heads, especially considering that the default-to-STDIN behavior has been around since the early days of SeqIO. 
From the SeqIO docs: > $seqIO = Bio::SeqIO->new(-format => $format); > .... > If neither a filehandle nor a filename is specified, then the module > will read from the @ARGV array or STDIN, using the familiar <> > semantics. Relying on a default behavior of a dependent module (Root::IO) always troubled me. It seems a better design to make it explicit in your script where you expect your input to come from. Typing "-fh=>\*ARGV" or putting an @ARGV loop around your script is extra work, but I think it's a change for the better. (BTW, this situation also exposes a weakness in the test code which didn't test the default _readline behavior -- I guess doing this is difficult within the Perl test framework). The issue remains: What to do about backwards compatibility? Some options: 1. Fix all of the scripts, examples, POD docs, bptutorial etc. to not rely on default STDIN/@ARGV reading behavior of _readline and release these as part of bioperl-1.4.1. 2. Revert _readline to its old behavior and add a new method in IO.pm that has the new behavior (_readline2). Update any module/script that needs the new _readline behaviour to use _readline2. #2 is the backward-compatible route, but uglier from a software engineering perspective. #1 breaks backward compatibility. Given the legacy of the old _readline behaviour, I'm favoring #2. Just seems more politic. We could still update the scripts and docs to discourage the old _readline behaviour. Thoughts? Steve On Feb 18, 2004, at 4:57 AM, Lincoln Stein wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Or do this: > > my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV); > while (my $result = $in->next_result()) { > ... > } > > That might even be easier. > > Lincoln > > On Wednesday 18 February 2004 12:27 pm, Steve Chervitz wrote: >> Looks like there was a change in the Root::IO.pm module that >> affects the way these scripts process command-line arguments.
As of >> bioperl-1.303, the SearchIO::blast module appears to be unable to >> read data from STDIN or files listed in @ARGV. This affects the >> scripts in examples/searchio and scripts/searchio. >> >> As a workaround, I'd recommend you iterate over @ARGV in your >> script and initialize the SearchIO object using the -file option to >> new(), as in: >> >> while (my $file = shift @ARGV) { >> my $in = Bio::SearchIO->new( -format => 'blast', >> -file => $file >> ); >> while ( my $result = $in->next_result() ) { >> # process result... >> } >> } >> >> As far as tracking down the cause, I've pinpointed the following >> change in Bio::Root::IO::_readline(): >> >> my $fh = $self->_fh or return; # revision 1.50 >> (bioperl-1.303) >> >> formerly this was: >> >> my $fh = $self->_fh || \*ARGV; # revision 1.49 >> (bioperl-1.302) >> >> This also appears to break SeqIO reading from STDIN. Try executing >> this at the top-level distribution dir for the 1.302 and 1.303 >> releases: >> >> perl -I. ./scripts/seq/translate_seq.PLS -format fasta < >> t/data/dna1.fa >> >> According to Lincoln's commit log, the Root::IO::_readline() change >> was necessary to get the GFF, SeqFeature, and Registry regression >> tests working. I tested these tests with the 1.49 version of IO.pm >> and the only one that was affected was SeqFeature.t. Specifically, >> test #6 which calls SeqFeature::Generic::gff_string() hangs and >> waits for input before proceeding. I'm not sure why this is... >> (getting late). >> >> BTW, platforms tested: Perl 5.6.1 and 5.8.0 on Linux (RH9) and Perl >> 5.8.1-RC3 on MacOS X (10.3.2). >> >> Steve >> >> On Feb 17, 2004, at 3:14 PM, Richard Rouse wrote: >>> I recent installed bioperl-1.4 and am having problems with the >>> blast report >>> parsers in /examples/searchio/ >>> >>> >>> When I run: >>> perl hitwriter.pl blastreport >>> I get: >>> >>> Using SearchIO->new() >>> >>> 0 Blast report(s) processed. 
>>> Output sent to file: >hitwriter.out >>> >>> I get the same result with rawwriter.pl, hspwriter.pl and >>> custom_writer.pl >>> although the htmlwriter.pl and the blast_example.pl work fine. >>> >>> Has anyone else encountered this problem and figured out how to >>> fix it? >>> >>> Thanks, >>> >>> Richard >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l@portal.open-bio.org >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l@portal.open-bio.org >> http://portal.open-bio.org/mailman/listinfo/bioperl-l > > - -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.1 (GNU/Linux) > > iD8DBQFAM2E00CIvUP7P+AkRAurTAJ9gwb4Os0M5uDWhlE40JphLRIAG+gCfQ5Ji > zXHLGwtfDAB2Np2nKBZkuw0= > =IsKs > -----END PGP SIGNATURE----- > From barry.moore at genetics.utah.edu Wed Feb 18 19:30:23 2004 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Wed Feb 18 19:36:34 2004 Subject: [Bioperl-l] Bio::Graphics::Browser::Markup Message-ID: <4034039F.1080107@genetics.utah.edu> Todd Harris recently pointed me to Bio::Graphics::Browser::Markup, and I can't find the damn thing anywhere. I've looked in the bioperl CVS, bioperl online docs and in my local installation. Google searches don't turn up anything, but suggest that I should be finding a Bio/Graphics/Browser/Markup.pm module - which I don't. Someone please enlighten the poor simple boy from Utah. Barry From jason at cgt.duhs.duke.edu Wed Feb 18 20:36:05 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Wed Feb 18 20:42:22 2004 Subject: [Bioperl-l] Bio::Graphics::Browser::Markup In-Reply-To: <4034039F.1080107@genetics.utah.edu> References: <4034039F.1080107@genetics.utah.edu> Message-ID: Hey Barry - It is part of Gbrowse... 
http://sourceforge.net/cvs/?group_id=27707 (pass is empty, just hit return) cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod login cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod co Generic-Genome-Browser -jason On Wed, 18 Feb 2004, Barry Moore wrote: > Todd Harris recently pointed me to Bio::Graphics::Browser::Markup, and I > can't find the damn thing anywhere. I've looked in the bioperl CVS, > bioperl online docs and in my local installation. Google searches don't > turn up anything, but suggest that I should be finding a > Bio/Graphics/Browser/Markup.pm module - which I don't. Someone please > enlighten the poor simple boy from Utah. > > Barry > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From rkh at gene.com Wed Feb 18 16:31:05 2004 From: rkh at gene.com (Reece Hart) Date: Wed Feb 18 21:36:35 2004 Subject: [Bioperl-l] Bio::Prospect:: -- a Perl interface to Prospect protein threading Message-ID: <1077139865.3514.687.camel@tallac> bioperl-l gang: We're pleased to announce the first public release of Bio::Prospect::, a Perl API for protein fold recognition with Prospect PRO. A mini-manuscript describing the module is available at http://prdownloads.sourceforge.net/prospect-if/manuscript.pdf?download. Abstract We present Bio::Prospect::, an object-oriented Perl Application Programming Interface (API) to the PROSPECT protein threading application. The Bio::Prospect:: modules facilitate executing the program, parsing the results, generating a homology model from an alignment, preparing a model for display with RasMol, and reconciling multiple pairwise alignments as a single multiple sequence alignment. The Bio::Prospect:: modules provide for local and remote execution of PROSPECT via a consistent interface. 
PROSPECT results may be represented with the full-featured Thread class or as a space-efficient distillation of results with the ThreadSummary class; instances of both classes may be serialized for network transmission of results from remote execution. LICENSE: The module is released under the Academic Free License v. 2.0 (http://www.opensource.org/licenses/afl-2.0.php). AVAILABILITY: The project is hosted on SourceForge (http://www.sourceforge.net/projects/prospect-if/). A perl install package is available through CPAN (http://search.cpan.org/~reece/). Example scripts are included. REQUIREMENTS: This module requires Prospect PRO (http://www.bioinformaticssolutions.com/products/prospect.php) and several other perl modules, all available from CPAN. Comments, bug reports, patches, and code contributions are encouraged. Happy Threading, Reece Hart and David Cavanaugh -- Reece Hart, Ph.D. rkh@gene.com, http://www.gene.com/ Genentech, Inc. 650/225-6133 (voice), -5389 (fax) Bioinformatics and Protein Engineering 1 DNA Way, MS-93 http://www.in-machina.com/~reece/ South San Francisco, CA 94080-4990 reece@in-machina.com, GPG: 0x25EC91A0 From neil.saunders at unsw.edu.au Wed Feb 18 21:55:24 2004 From: neil.saunders at unsw.edu.au (Neil Saunders) Date: Wed Feb 18 22:01:47 2004 Subject: [Bioperl-l] Re: Bio::Prospect In-Reply-To: <200402190238.i1J2av9T015567@portal.open-bio.org> References: <200402190238.i1J2av9T015567@portal.open-bio.org> Message-ID: <20040219025524.GA15515@psychro> > We're pleased to announce the first public release of Bio::Prospect::, a > Perl API for protein fold recognition with Prospect PRO. A > Comments, bug reports, patches, and code contributions are encouraged. This looks very useful. It would be even more so if it worked with the freely-available Prospect 2.0, rather than the commercial Prospect Pro. Any chance of this? 
Neil -- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney 2052, Australia http://psychro.bioinformatics.unsw.edu.au/neil/index.php From cain at cshl.org Wed Feb 18 22:14:58 2004 From: cain at cshl.org (Scott Cain) Date: Wed Feb 18 22:21:13 2004 Subject: [Bioperl-l] SGD GFF3 file available soon In-Reply-To: <200402190237.i1J2av9R015567@portal.open-bio.org> References: <200402190237.i1J2av9R015567@portal.open-bio.org> Message-ID: <1077160498.1473.72.camel@localhost.localdomain> Stan, In your sample GFF, the seqid in the first column has to correspond to some ID, usually also defined in the same GFF file. For instance, if the features in the GFF file are all on chromosome I, the first column of all of those lines would have the same ID as the ID declared for chromosome I. For example: I SGD chromosome 1 230211 . . . ID=I;description=Sequence "I" I SGD telomere 1 801 . - 0 ID=TEL01L;description=I left telomeric region;db_xref=SGD:S0028862 I SGD repeat_family 1 62 . - 0 ID=TEL01L-TR;name=Telomeric Repeat;description=I left telomere TG(1-3);db_xref=SGD:S0028864 ...etc... Sorry I didn't point that out before--when I looked at the Excel sheet you sent me before, I didn't see all of it (I am too used to working with plain text files). Scott -------------Original Message--------------- > Date: Wed, 18 Feb 2004 14:09:27 -0800 > From: Stan Dong > Subject: [Bioperl-l] SGD GFF3 file available soon > To: bioperl-l@bioperl.org > Message-ID: <1DE37948-625F-11D8-89C8-000A956A0A36@genome.stanford.edu> > Content-Type: text/plain; charset=US-ASCII; format=flowed > > Hi, > > I am a programmer at Saccharomyces Genome Database ( SGD, > http://www.yeastgenome.org/ ). I am working on developing a flat file > in GFF3 format ( http://song.sourceforge.net/gff3-jan04.shtml ) to > represent sequence features of yeast genome and it will soon be > released on our ftp site. 
This is very useful because quite a few open > source softwares can take this file format as input such as Gbrowse, > Chado etc. > > I would like comments from people who are interested in doing similar > things and those who have good/not-so-good experience on GFF3 to share > with. For me, it took a while to get the specification done especially > make the third column (type) fully compatible with Sequence Ontology > (SO). One thing I liked about GFF3 is the last column (attributes) > where you can put all kinds of useful information such as in our case > GO annotation and a nice description of a feature. An example file of > SGD GFF3 can be viewed here. > > ftp://genome-ftp.stanford.edu/pub/people/curator/GFF3Example.txt > > Thanks, > > Stan Dong > Programmer, SGD -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain@cshl.org GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From cain at cshl.org Wed Feb 18 22:20:48 2004 From: cain at cshl.org (Scott Cain) Date: Wed Feb 18 22:27:03 2004 Subject: [Bioperl-l] Bio::Graphics::Browser::Markup In-Reply-To: <200402190237.i1J2av9R015567@portal.open-bio.org> References: <200402190237.i1J2av9R015567@portal.open-bio.org> Message-ID: <1077160848.1477.79.camel@localhost.localdomain> Or if you don't want to deal with the anonymous cvs server (which can be quite slow at times), you can download a nightly build from CVS at http://www.gmod.org/Generic-Genome-Browser.tar.gz I would normally suggest that you go to the download page from http://www.gmod.org/, but I am in the process of preparing a new release. Scott --------------Original Message------------------- > Date: Wed, 18 Feb 2004 20:36:05 -0500 (EST) > From: Jason Stajich > Subject: Re: [Bioperl-l] Bio::Graphics::Browser::Markup > To: Barry Moore > Cc: bioperl > Message-ID: > > Content-Type: TEXT/PLAIN; charset=US-ASCII > > > Hey Barry - > > It is part of Gbrowse... 
> http://sourceforge.net/cvs/?group_id=27707 > > (pass is empty, just hit return) > cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod login > > cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod co Generic-Genome-Browser > > -jason > On Wed, 18 Feb 2004, Barry Moore wrote: > > > Todd Harris recently pointed me to Bio::Graphics::Browser::Markup, and I > > can't find the damn thing anywhere. I've looked in the bioperl CVS, > > bioperl online docs and in my local installation. Google searches don't > > turn up anything, but suggest that I should be finding a > > Bio/Graphics/Browser/Markup.pm module - which I don't. Someone please > > enlighten the poor simple boy from Utah. > > > > Barry > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain@cshl.org GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory
From qdong at genome.stanford.edu Thu Feb 19 01:42:32 2004 From: qdong at genome.stanford.edu (Stan Dong) Date: Thu Feb 19 01:48:50 2004 Subject: [Bioperl-l] SGD GFF3 file available soon In-Reply-To: <1077160498.1473.72.camel@localhost.localdomain> Message-ID: Hi Scott, In my examples, I use arabic number in the seqid column to indicate chromosome number. So I should put 'ID=1' in the attribute column of the first line which represents the whole chromosome. Since these IDs need to be unique within the scope of the GFF file, I think it's better to use a more descriptive name like 'chr01' in this case (and 'ID=chr01' in the attribute column). Thanks a lot for your suggestion, -Stan On Wed, 18 Feb 2004, Scott Cain wrote: > Stan, > > In your sample GFF, the seqid in the first column has to correspond to > some ID, usually also defined in the same GFF file. For instance, if > the features in the GFF file are all on chromosome I, the first column > of all of those lines would have the same ID as the ID declared for > chromosome I. For example: > > I SGD chromosome 1 230211 . . . ID=I;description=Sequence "I" > I SGD telomere 1 801 .
- 0 ID=TEL01L;description=I left telomeric region;db_xref=SGD:S0028862 > I SGD repeat_family 1 62 . - 0 ID=TEL01L-TR;name=Telomeric Repeat;description=I left telomere TG(1-3);db_xref=SGD:S0028864 > ...etc... > > Sorry I didn't point that out before--when I looked at the Excel sheet > you sent me before, I didn't see all of it (I am too used to working > with plain text files). > > Scott > > -------------Original Message--------------- > > Date: Wed, 18 Feb 2004 14:09:27 -0800 > > From: Stan Dong > > Subject: [Bioperl-l] SGD GFF3 file available soon > > To: bioperl-l@bioperl.org > > Message-ID: <1DE37948-625F-11D8-89C8-000A956A0A36@genome.stanford.edu> > > Content-Type: text/plain; charset=US-ASCII; format=flowed > > > > Hi, > > > > I am a programmer at Saccharomyces Genome Database ( SGD, > > http://www.yeastgenome.org/ ). I am working on developing a flat file > > in GFF3 format ( http://song.sourceforge.net/gff3-jan04.shtml ) to > > represent sequence features of yeast genome and it will soon be > > released on our ftp site. This is very useful because quite a few open > > source softwares can take this file format as input such as Gbrowse, > > Chado etc. > > > > I would like comments from people who are interested in doing similar > > things and those who have good/not-so-good experience on GFF3 to share > > with. For me, it took a while to get the specification done especially > > make the third column (type) fully compatible with Sequence Ontology > > (SO). One thing I liked about GFF3 is the last column (attributes) > > where you can put all kinds of useful information such as in our case > > GO annotation and a nice description of a feature. An example file of > > SGD GFF3 can be viewed here. > > > > ftp://genome-ftp.stanford.edu/pub/people/curator/GFF3Example.txt > > > > Thanks, > > > > Stan Dong > > Programmer, SGD > > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. 
cain@cshl.org > GMOD Coordinator (http://www.gmod.org/) 216-392-3087 > Cold Spring Harbor Laboratory > From lstein at cshl.edu Thu Feb 19 04:06:57 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Thu Feb 19 04:13:28 2004 Subject: [Bioperl-l] Bio::Graphics::Browser::Markup In-Reply-To: <4034039F.1080107@genetics.utah.edu> References: <4034039F.1080107@genetics.utah.edu> Message-ID: <200402191106.57903.lstein@cshl.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 It's in Generic Genome Browser. http://www.gmod.org/ggb/ For those who aren't familiar with the module, it lets you markup arbitrary strings (alignments, FASTA files, etc) with HTML using a simple stylesheet system. You can markup with colors, text styles, and arbitrary text. Overlapping mark-up works properly (e.g. text styles and colors are additive, and overlapping colors mix properly using HSV addition). Lincoln On Thursday 19 February 2004 02:30 am, Barry Moore wrote: > Todd Harris recently pointed me to Bio::Graphics::Browser::Markup, > and I can't find the damn thing anywhere. I've looked in the > bioperl CVS, bioperl online docs and in my local installation. > Google searches don't turn up anything, but suggest that I should > be finding a > Bio/Graphics/Browser/Markup.pm module - which I don't. Someone > please enlighten the poor simple boy from Utah. > > Barry > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l - -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFANHyx0CIvUP7P+AkRAlIbAJ49AIjh4pFjtPu3hY9liHWpDnw5sQCgoXz9 X6BwmN2OXLkL3AraWkoqr3A= =RYHR -----END PGP SIGNATURE----- From lstein at cshl.edu Thu Feb 19 05:08:43 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Thu Feb 19 05:15:06 2004 Subject: [Bioperl-l] Clickable Glyphs... 
In-Reply-To: References: Message-ID: <200402191208.44083.lstein@cshl.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Get the latest CVS version of bioperl-live and read the section of the Bio::Graphics::Panel manual page labeled "Creating Imagemaps." Essentially what you need to do is to replace the section after you create the panel with this: my ($url,$map,$mapname) = $panel->image_and_map( -root => '/var/www/html', -url => '/tmpimages'); print $q->header(),$q->start_html('A Bitmap Rendering'); print $q->img({-src=>$url,-usemap=>"#$mapname"}); print $map; print $q->end_html; I'm frankly more fond of the function-oriented CGI calls, so I would bring in the standard functions and then: print header(), start_html('A Bitmap Rendering'), img({-src=>$url,-usemap=>"#$mapname"}), $map, end_html(); Lincoln On Wednesday 18 February 2004 05:53 pm, Jonathan Greenwood wrote: > Hi, I've submitted my code with the email, what I'm trying to do is > to render a Genbank file as a png file, I need to make each glyph > clickable(I'm also displaying this page online)...any help with the > new changes to Bio::Graphics::Panel would be appreciated...many > thanks... > > Sincerely, > > Jonathan Greenwood > email: jonathon@mgcheo.med.uottawa.ca > > code: > #!
/usr/local/bin/perl -wT > > use strict; > use Bio::Graphics; > use Bio::SeqIO; > use Bio::SeqFeature::Generic; > use CGI; > use CGI::Pretty; > > my $file = 'x65306.gb'; > my $io = Bio::SeqIO->new(-file=>$file); > my $seq = $io->next_seq; > my $wholeseq = Bio::SeqFeature::Generic->new(-start=>1, > > -end=>$seq->length); > my @features = $seq->all_SeqFeatures; > my $q = new CGI; > > # sort features by their primary tags > my %sorted_features; > for my $f (@features) { > my $tag = $f->primary_tag; > push @{$sorted_features{$tag}},$f; > } > > print $q->header( 'text/html' ); > print $q->start_html('A Vector Rendering'); > > my $panel = Bio::Graphics::Panel->new(-length => $seq->length, > -width => 1000, > -pad_left => 10, > -pad_right => 10, > -key_color => 'white', > -key_spacing => 15, > -key_style => 'bottom', > -spacing => -0.25, > -box_subparts => 'true' > ); > > my ($url,$map,$mapname) = $panel->image_and_map(-root => > '/webfiles/cgi-bin', > -url => '/tmpimages', > ); > > $panel->add_track($wholeseq, > -glyph => 'arrow', > -bump => +1, > -double => 1, > -tick => 2 > ); > > $panel->add_track($wholeseq, > -glyph => 'generic', > -bgcolor => 'purple', > -height => 12, > -key => 'Whole Sequence', > -title => 'Whole Sequence' > ); > > # special feature > if ($sorted_features{CDS}) { > $panel->add_track($sorted_features{CDS}, > -glyph => 'transcript2', > -bgcolor => 'orange', > -bump => +1, > -height => 12, > -key => 'CDS', > -label => \&gene_label, > -title => 'CDS', > -link => 'feature1.html#CDS' > ); > delete $sorted_features{'CDS'}; > } > > #general case > my @colors = qw(wheat blue yellow green cyan chartreuse magenta > gray); my $idx = 0; > for my $tag (sort keys %sorted_features) { > my $features = $sorted_features{$tag}; > $panel->add_track($features, > -glyph => 'generic', > -bgcolor => $colors[$idx++ % @colors], > -fgcolor => 'black', > -font2color => 'red', > -key => "${tag}s", > -bump => +1, > -height => 12, > -label => \&gene_label, > -description => 
\&generic_description, > -title => \&gene_label, > -link => 'feature1.html#$tag', > ); > } > > print $q->img({-src=>$url,-usemap=>"#$mapname"}); > print $q->$map; > print $q->($panel->png); > > print $q->exit_html; > > exit; > > sub gene_label { > my $feature = shift; > my @notes; > foreach (qw(product gene)) { > next unless $feature->has_tag($_); > @notes = $feature->each_tag_value($_); > last; > } > $notes[0]; > } > > sub generic_description { > my $feature = shift; > my $description; > foreach ($feature->all_tags) { > my @values = $feature->each_tag_value($_); > $description .= $_ eq 'note' ? "@values" : "$_=@values; "; > } > $description =~ s/; $//; # get rid of last > $description; > } > > _________________________________________________________________ > The new MSN 8: smart spam protection and 2 months FREE* > http://join.msn.com/?page=features/junkmail > http://join.msn.com/?page=dept/bcomm&pgmarket=en-ca&RU=http%3a%2f%2 >fjoin.msn.com%2f%3fpage%3dmisc%2fspecialoffers%26pgmarket%3den-ca > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l - -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFANIss0CIvUP7P+AkRAmh/AJ9SaY4MIZPS5vW5gE5xzaw7AzrjaQCdHJdE S+2+MS2vScLrVTd+C3V4mME= =MBei -----END PGP SIGNATURE----- From awitney at sghms.ac.uk Thu Feb 19 07:05:26 2004 From: awitney at sghms.ac.uk (Adam Witney) Date: Thu Feb 19 07:12:41 2004 Subject: [Bioperl-l] Subject length using BPlite.pm Message-ID: Hi, I am using BPlite.pm to parse a BLAST output, is it possible to get the Subject length? (that's the length of the whole subject sequence, not just the part involved in the hsp) Thanks for any help Adam -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
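A minimal sketch for the question above, assuming a BioPerl installation that provides Bio::SearchIO and a BLAST report saved in a hypothetical file named report.blast. It is one possible approach rather than a BPlite answer: it reads the full subject (hit) sequence length from the hit object instead of from an HSP. The input file is named explicitly with -file, since the 1.4 parsers discussed elsewhere in this archive no longer fall back to STDIN.

```perl
use strict;
use warnings;
use Bio::SearchIO;

# 'report.blast' is a placeholder name, not a file from the thread.
my $in = Bio::SearchIO->new(-format => 'blast',
                            -file   => 'report.blast');

while (my $result = $in->next_result) {
    while (my $hit = $result->next_hit) {
        # $hit->length is the length of the whole subject sequence,
        # not the length of the aligned region of any single HSP.
        printf "%s\t%s\t%d\n",
            $result->query_name, $hit->name, $hit->length;
    }
}
```

Each output row holds the query name, the hit name, and the subject length, separated by tabs.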
From todd.harris at cshl.edu Thu Feb 19 09:39:12 2004 From: todd.harris at cshl.edu (Todd Harris) Date: Thu Feb 19 09:45:32 2004 Subject: [Bioperl-l] Bio::Graphics::Browser::Markup In-Reply-To: Message-ID: Whoops! My apologies. Thanks for the correction, JS. Yep, it's part of GBrowse, not bioperl. t > On 2/18/04 7:36 PM, Jason Stajich wrote: > > Hey Barry - > > It is part of Gbrowse... > http://sourceforge.net/cvs/?group_id=27707 > > (pass is empty, just hit return) > cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod login > > cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod co > Generic-Genome-Browser > > -jason > On Wed, 18 Feb 2004, Barry Moore wrote: > >> Todd Harris recently pointed me to Bio::Graphics::Browser::Markup, and I >> can't find the damn thing anywhere. I've looked in the bioperl CVS, >> bioperl online docs and in my local installation. Google searches don't >> turn up anything, but suggest that I should be finding a >> Bio/Graphics/Browser/Markup.pm module - which I don't. Someone please >> enlighten the poor simple boy from Utah. >> >> Barry >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l@portal.open-bio.org >> http://portal.open-bio.org/mailman/listinfo/bioperl-l >> > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > From jason at cgt.duhs.duke.edu Thu Feb 19 10:56:11 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Thu Feb 19 11:03:00 2004 Subject: [Bioperl-l] Subject length using BPlite.pm In-Reply-To: References: Message-ID: BTW SearchIO is the only supported Blast parser so I will always suggest moving to SearchIO::blast for your parsing needs.... 
But I don't think Ian/Peter put a method call in there so you have to use $sbjct->{'LENGTH'} where sbjct came from the nextSbjct call like: use Bio::Tools::BPlite; my $report = new Bio::Tools::BPlite(-fh=>\*STDIN); while(my $sbjct = $report->nextSbjct) { } -jason On Thu, 19 Feb 2004, Adam Witney wrote: > > Hi, > > I am using BPlite.pm to parse a BLAST output, is it possible to get the > Subject length? (that's the length of the whole subject sequence, not just > the part involved in the hsp) > > Thanks for any help > > Adam > > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From jegreenwood25 at hotmail.com Thu Feb 19 13:50:05 2004 From: jegreenwood25 at hotmail.com (Jonathan Greenwood) Date: Thu Feb 19 13:56:19 2004 Subject: [Bioperl-l] More help.... Message-ID: Hi, this combines CGI and BioPerl, what i need to do is open a Genbank file and then parse(write out) the features. But I need these parsed features to go into a textbox for editing, and then be able to save the data I have just edited...Please Help!!! Many thanks...the code is enclosed with the email.... Jonathan Greenwood email: jonathon@mgcheo.med.uottawa.ca Code: #!
/usr/local/bin/perl -wT use strict; use CGI qw / :standard /; use CGI::Pretty; use Bio::SeqIO; use Bio::SeqFeature::Generic; use Bio::Location::Simple; use Bio::Location::SplitLocationI; my @features = read_file(param('file')) if param('file'); print header, start_html('Plasmid Feature Editor'); print h1('Plasmid Feature Editor'); print p('Load up a Genbank file to work with, then edit the features in the text box.'); print start_multipart_form(), table({-cellpadding => 10}, TR({-class=> 'resultsbody'}, td(textarea('-name' => 'editarea', '-value' => (@features), '-rows' => 20, '-cols' => 70, '-override' => (@features) || (param('clear')), ), ), ), TR({-class=>'resultstitle'}, td(filefield(-name => 'uploaded_file', -length => 40), ), td(submit(-name => 'submit_button', -value => 'Click to display features'), ), ), TR({-class=>'resultstitle'}, td(submit(-name => 'save_button', -value => 'Click here to save your work'), ), td(reset(), ), ), ), end_form; print end_html; exit; sub read_file { my $fh = param('uploaded_file'); my $gb_parser = Bio::SeqIO->new(-fh=>$fh,-format=>'genbank'); my @features; while (my $seq = $gb_parser->next_seq) { push @features,$seq->get_all_SeqFeatures(); } return @features; } _________________________________________________________________ MSN 8 helps eliminate e-mail viruses. Get 2 months FREE*. 
http://join.msn.com/?page=features/virus&pgmarket=en-ca&RU=http%3a%2f%2fjoin.msn.com%2f%3fpage%3dmisc%2fspecialoffers%26pgmarket%3den-ca From reece at in-machina.com Thu Feb 19 11:34:55 2004 From: reece at in-machina.com (Reece Hart) Date: Thu Feb 19 14:32:53 2004 Subject: [Bioperl-l] Re: Bio::Prospect In-Reply-To: <20040219025524.GA15515@psychro> References: <200402190238.i1J2av9T015567@portal.open-bio.org> <20040219025524.GA15515@psychro> Message-ID: <1077208495.3514.744.camel@tallac> On Wed, 2004-02-18 at 18:55, Neil Saunders wrote: > > We're pleased to announce the first public release of Bio::Prospect::, a > > Perl API for protein fold recognition with Prospect PRO. A > > Comments, bug reports, patches, and code contributions are encouraged. I'm glad you think it might be useful. Apparently, you should have received the Bioinformatics Applications Note instead of the obviously non-perl'ing, non-threading referees who did get it. ;-) Oh well. I believe that Prospect 2.0 and Pro are really the same product. We started developing this a long time ago (with version 1, then overhauled it for a 2.0 prerelease). I'd appreciate knowing if you use it, or don't use it because it's too difficult to install, etc. Good luck, Reece -- Reece Hart, Ph.D. rkh@gene.com, http://www.gene.com/ Genentech, Inc. 650/225-6133 (voice), -5389 (fax) Bioinformatics and Protein Engineering 1 DNA Way, MS-93 http://www.in-machina.com/~reece/ South San Francisco, CA 94080-4990 reece@in-machina.com, GPG: 0x25EC91A0 From Annie.Law at nrc-cnrc.gc.ca Thu Feb 19 15:50:10 2004 From: Annie.Law at nrc-cnrc.gc.ca (Law, Annie) Date: Thu Feb 19 15:56:54 2004 Subject: [Bioperl-l] New GO Parser and errors loading biosql database Message-ID: <10C94843061E094A98C02EB77CFC328722FE08@nrcmrdex1d.imsb.nrc.ca> Hi Hilmar, Thanks for the tips. I got the GO to go. I have some questions about the GO loading result, the bioperl-db make test, and the overall order of loading a database.
1) I installed the Graph module and loading of the GO information into an empty databse seems to run okay in the safe mode. However, many of the entries are not able to be inserted (roughly 200). Mostly complaining about how the column name cannot be null. However, I'm not sure if it is related to The make test errors I am having with bioperl-db that I have listed below or if this is an acceptable result. In general how should a user gauge how successful a load of the database was? I guess you can sort of look at the total number of expected number entries. -------------------- WARNING --------------------- MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were ("BBD_pathwayID:C1cyc","","","") FKs (2) Column 'name' cannot be null --------------------------------------------------- Could not store BBD_pathwayID:C1cyc (): ------------- EXCEPTION ------------- MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be found by unique key STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253 STACK Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270 STACK (eval) /root/bioperl-db/scripts/biosql/load_ontology.pl:508 STACK toplevel /root/bioperl-db/scripts/biosql/load_ontology.pl:490 -------------------------------------- -------------------- WARNING --------------------- MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were ("BBD_pathwayID:abs","","","") FKs (3) Column 'name' cannot be null --------------------------------------------------- Could not store BBD_pathwayID:abs (): ------------- EXCEPTION ------------- MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be found by unique key STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253 STACK Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270 STACK (eval) /root/bioperl-db/scripts/biosql/load_ontology.pl:508 STACK toplevel /root/bioperl-db/scripts/biosql/load_ontology.pl:490 -------------------------------------- 2) I have a question about The make test bioperl-db results which may be related to the results that I am getting. I seem to be having problems with the make test for bioperl-db. I downloaded the tarball from the CVS website and installed it. I looked at the documentation and I created User biosql which has been given all the permissions it needs. I also renamed the files as stated in the steps below. In the t directory of bioperl-db $ cd t $ cp DBHarness.conf.example DBHarness.biosql.conf $ cp DBHarness.conf.example DBHarness.markerdb.conf I also put a copy of those file in the bioperl-db in the home directory since that was documented for the newest version Of bioperl-db. I did a make test in the bioperl-db directory and go the following results. Most of the tests seem to fail. I am not sure why. [root@microarray bioperl-db]# maket test PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/cluster.......install_driver(mysql) failed: Can't load '/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mys ql.so' for module DBD::mysql: /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysq l.so: undefined symbol: mysql_ssl_set at /usr/lib/perl5/5.8.0/i386-linux-thread-multi/DynaLoader.pm line 229. at (eval 4) line 3 Compilation failed in require at (eval 4) line 3. 
Perhaps a required shared library or dll isn't installed where expected at t/DBTestHarness.pm line 211 t/cluster.......dubious Test returned status 255 (wstat 65280, 0xff00) DIED. FAILED tests 1-160 Failed 160/160 tests, 0.00% okay t/comment.......install_driver(mysql) failed: Can't load '/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mys ql.so' for module DBD::mysql: /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysq l.so: undefined symbol: mysql_ssl_set at /usr/lib/perl5/5.8.0/i386-linux-thread-multi/DynaLoader.pm line 229. at (eval 4) line 3 Compilation failed in require at (eval 4) line 3. Perhaps a required shared library or dll isn't installed where expected at t/DBTestHarness.pm line 211 t/comment.......dubious Test returned status 255 (wstat 65280, 0xff00) DIED. FAILED tests 1-11 Failed 11/11 tests, 0.00% okay t/dbadaptor.....install_driver(mysql) failed: Can't load '/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mys ql.so' for module DBD::mysql: /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysq l.so: undefined symbol: mysql_ssl_set at /usr/lib/perl5/5.8.0/i386-linux-thread-multi/DynaLoader.pm line 229. at (eval 5) line 3 Compilation failed in require at (eval 5) line 3. Perhaps a required shared library or dll isn't installed where expected at t/DBTestHarness.pm line 211 t/swiss.........dubious Test returned status 255 (wstat 65280, 0xff00) DIED. 
FAILED tests 1-52 Failed 52/52 tests, 0.00% okay Failed Test Stat Wstat Total Fail Failed List of Failed ---------------------------------------------------------------------------- --- t/cluster.t 255 65280 160 160 100.00% 1-160 t/comment.t 255 65280 11 11 100.00% 1-11 t/dbadaptor.t 255 65280 6 6 100.00% 1-6 t/dblink.t 255 65280 18 18 100.00% 1-18 t/ensembl.t 255 65280 15 15 100.00% 1-15 t/fuzzy2.t 255 65280 21 21 100.00% 1-21 t/genbank.t 255 65280 18 18 100.00% 1-18 t/locuslink.t 255 65280 110 110 100.00% 1-110 t/ontology.t 255 65280 302 302 100.00% 1-302 t/remove.t 255 65280 59 59 100.00% 1-59 t/seqfeature.t 255 65280 48 48 100.00% 1-48 t/simpleseq.t 255 65280 27 27 100.00% 1-27 t/species.t 255 65280 65 65 100.00% 1-65 t/swiss.t 255 65280 52 52 100.00% 1-52 Failed 14/15 test scripts, 6.67% okay. 912/930 subtests failed, 1.94% okay. make: *** [test_dynamic] Error 2 3) Previously when I did a make test for the Bioperl 1.4 installation most of the tests passed 97% I'm not sure whether the errors are expected or not Here are the results of the make test. I only cut out the beginning of the test and the summary at the end. Installation of bioperl ------------- EXCEPTION ------------- MSG: Failed to load module Bio::SeqIO::game. Can't locate IO/String.pm in @INC (@INC contains: . t /root/bioperl-1.4/blib/lib /root/bioperl-1.4/blib/arch /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl) at Bio/SeqIO/game/gameWriter.pm line 63. BEGIN failed--compilation aborted at Bio/SeqIO/game/gameWriter.pm line 63. Compilation failed in require at Bio/SeqIO/game.pm line 77. BEGIN failed--compilation aborted at Bio/SeqIO/game.pm line 77. Compilation failed in require at /root/bioperl-1.4/blib/lib/Bio/Root/Root.pm line 394. 
STACK Bio::Root::Root::_load_module /root/bioperl-1.4/blib/lib/Bio/Root/Root.pm:396 STACK (eval) /root/bioperl-1.4/blib/lib/Bio/SeqIO.pm:549 STACK Bio::SeqIO::_load_format_module /root/bioperl-1.4/blib/lib/Bio/SeqIO.pm:548 STACK Bio::SeqIO::new /root/bioperl-1.4/blib/lib/Bio/SeqIO.pm:377 STACK (eval) /root/bioperl-1.4/blib/lib/bptutorial.pl:4027 STACK main::__ANON__ /root/bioperl-1.4/blib/lib/bptutorial.pl:4025 STACK main::run_examples /root/bioperl-1.4/blib/lib/bptutorial.pl:4152 STACK toplevel t/tutorial.t:23 -------------------------------------- For more information about the SeqIO system please see the SeqIO docs. This includes ways of checking for formats at compile time, not run time Can't call method "next_seq" on an undefined value at /root/bioperl-1.4/blib/lib/bptutorial.pl line 4035. Failed Test Stat Wstat Total Fail Failed List of Failed ---------------------------------------------------------------------------- --- t/BioDBGFF.t 133 133 100.00% 1-133 t/ESEfinder.t 2 512 12 0 0.00% ?? t/GuessSeqFormat.t 2 512 46 46 100.00% 1-46 t/SeqFeature.t 25 6400 74 0 0.00% ?? t/tutorial.t 2 512 21 3 14.29% 19-21 22 subtests skipped. Failed 5/179 test scripts, 97.21% okay. 182/8122 subtests failed, 97.76% okay. make: *** [test_dynamic] Error 29 #end of installation of bioperl 4) Also, hopefully when I get this all running I would like to know what is the best order for loading the database. I know you mentionned that the GO database information should be loaded before the locuslink information. Here is the list of proposed order of entering information into the database. Can you use load_seqdatabase.pl for loading unigene information? 1. load NCBI taxonomy database with load_ncbi_taxonomy.pl 2. GO information 3. load locuslink database information 4. 
unigene information which I also had problems with loading information in [root@ bioperl-1.4]#perl /root/bioperl-db/scripts/biosql/load_seqdatabase.pl --dbuser=root --dbpass=ms22 --dbname bioseqdb --namespace "Unigene" -format unigene /root/bioperl--1.4/unigenedata/Hs.data Loading /root/bioperl-1.4/unigenedata/Hs.data ... Bio::SeqIO: unigene cannot be found Exception ------------- EXCEPTION ------------- MSG: Failed to load module Bio::SeqIO::unigene. Can't locate Bio/SeqIO/unigene.pm in @INC (@INC contains: /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 .) at /usr/lib/perl5/site_perl/5.8.0/Bio/Root/Root.pm line 394. STACK Bio::Root::Root::_load_module /usr/lib/perl5/site_perl/5.8.0/Bio/Root/Root.pm:396 STACK (eval) /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO.pm:549 STACK Bio::SeqIO::_load_format_module /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO.pm:548 STACK Bio::SeqIO::new /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO.pm:377 STACK toplevel /root/bioperl-db/scripts/biosql/load_seqdatabase.pl:436 -------------------------------------- For more information about the SeqIO system please see the SeqIO docs. This includes ways of checking for formats at compile time, not run time Can't call method "next_seq" on an undefined value at /root/bioperl-db/scripts/biosql/load_seqdatabase.pl line 460. Thanks very much, Annie. 
From barry.moore at genetics.utah.edu Thu Feb 19 17:46:38 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Thu Feb 19 17:52:55 2004
Subject: [Fwd: Re: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM]
In-Reply-To: <403517B9.6000102@asalup.org>
References: <4033C7A5.8000805@genetics.utah.edu> <403517B9.6000102@asalup.org>
Message-ID: <40353CCE.2080502@genetics.utah.edu>

Sebastian,

These lines account for initiation of duplex formation:

    #Account for initiation parameters
    $enthalpy += $thermo_values{substr($sequence, 0, 1)}{enthalpy};
    $entropy += $thermo_values{substr($sequence, 0, 1)}{entropy};
    $enthalpy += $thermo_values{substr($sequence, -1, 1)}{enthalpy};
    $entropy += $thermo_values{substr($sequence, -1, 1)}{entropy};

However, your question made me go back and look at the 1997 and 1998
SantaLucia papers again, and I realized that I have applied the symmetry
correction incorrectly. Symmetry correction should only be applied to
self-complementary oligos. The code could be modified to identify these
and apply symmetry correction, but short of that the correction should
probably just be removed, since most oligos (especially ones used in
molecular biology) won't be self-complementary.

Rob, that could be fixed by replacing this line:

    $entropy -= 1.4;

with something like this line:

    #$entropy -= 1.4; #Should only be applied to self-complementary oligos, so add code to test self-complementarity before applying this line to Tm calculations

Barry Moore
Dept. Human Genetics
University of Utah

Sebastian Bassi wrote:

> Barry Moore wrote:
>
>> Sebastian,
>>
>> You may have picked this up from the bioperl list if you follow
>> that, but since it sounded like you were on the python side, I
>> thought I'd pass it along.
>
>
> Yes. I am working on this in BioPython.
> What it seems this code is lacking is some correction for "duplex
> initialization". That is stated in the SantaLucia paper.
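
The self-complementarity test mentioned in the message above could look roughly like the sketch below. It assumes that "self-complementary" means the oligo is identical to its own reverse complement and that the sequence contains only plain A/C/G/T. The helper names (revcomp, is_self_complementary, apply_symmetry_correction) are hypothetical and are not part of Bio::SeqFeature::Primer; the -1.4 term is simply the value quoted in the message.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse complement of a plain A/C/G/T string.
# Ambiguity codes are not handled in this sketch.
sub revcomp {
    my ($seq) = @_;
    my $rc = scalar reverse( uc $seq );
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

# Hypothetical helper: true when the oligo equals its own reverse complement.
sub is_self_complementary {
    my ($seq) = @_;
    return uc($seq) eq revcomp($seq) ? 1 : 0;
}

# Subtract the symmetry term (1.4, as quoted in the message) only for
# self-complementary oligos; $entropy is assumed to be the running sum.
sub apply_symmetry_correction {
    my ( $entropy, $seq ) = @_;
    $entropy -= 1.4 if is_self_complementary($seq);
    return $entropy;
}

for my $oligo (qw(GAATTC ACCCGTGAGCTG)) {
    printf "%s self-complementary: %s\n", $oligo,
        is_self_complementary($oligo) ? 'yes' : 'no';
}
```

With this definition an odd-length oligo can never qualify, because its middle base would have to pair with itself.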
> From steve at trutane.net Thu Feb 19 19:33:33 2004 From: steve at trutane.net (Steve Chervitz Trutane) Date: Thu Feb 19 19:39:36 2004 Subject: [Bioperl-l] searchio scripts In-Reply-To: References: Message-ID: <6A08F6BD-633C-11D8-A988-000A95765236@trutane.net> On Feb 18, 2004, at 3:09 PM, Richard Rouse wrote: > I tried Steve's suggestion by putting this right above: > while ( my $blast = $in->next_result() ) { > > Then putting another } at the end of the script. > > Doing this and then running a large blast output file, I got: > > Using SearchIO->new() > > Report 1: Bio::Search::Result::BlastResult=HASH(0x87d7fb8) > > Report 2: Bio::Search::Result::BlastResult=HASH(0x8dfea60) The ugly 'HASH' output is because we're no longer overloading "". You'll have to call to_string() on the BlastResult object to get prettier output. > > ------------- EXCEPTION ------------- > MSG: Trouble in ResultTableWriter::_set_row_data_func() eval: This is a bug in either the parsing code and/or GenericHit. I get the same trouble parsing /t/data/blast.report in the Bioperl distribution. Was the report you were parsing a TBLASTN? Could you file a bug report on this at http://bugzilla.bioperl.org/ ? Thanks. Steve > ------------- EXCEPTION ------------- > MSG: Can't get identical or conserved data: no data. 
> STACK Bio::Search::Hit::GenericHit::matches > ../..//Bio/Search/Hit/GenericHit.pm:852 > STACK Bio::Search::Hit::GenericHit::frac_identical > ../..//Bio/Search/Hit/GenericHit.pm:1043 > STACK (eval) (eval 310):1 > STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__ > ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:327 > STACK Bio::SearchIO::Writer::HitTableWriter::to_string > ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267 > STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321 > STACK Bio::SearchIO::blast::write_result > ../..//Bio/SearchIO/blast.pm:1495 > STACK toplevel new.mod.hitwriter.pl:106 > > -------------------------------------- > > > > STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__ > ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:329 > STACK Bio::SearchIO::Writer::HitTableWriter::to_string > ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267 > STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321 > STACK Bio::SearchIO::blast::write_result > ../..//Bio/SearchIO/blast.pm:1495 > STACK toplevel new.mod.hitwriter.pl:106 > > I tried Lincoln's suggestion as well. In this case I added: > > my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV); > > above > > while ( my $blast = $in->next_result() ) { > > > This script just runs getting no result. > > By the way I am running Suse linux 9.0, perl 5.8.1 > > Thanks, > Richard
From rrouse at biomail.ucsd.edu Thu Feb 19 20:44:10 2004 From: rrouse at biomail.ucsd.edu (Richard Rouse) Date: Thu Feb 19 20:50:14 2004 Subject: [Bioperl-l] searchio scripts In-Reply-To: <6A08F6BD-633C-11D8-A988-000A95765236@trutane.net> Message-ID: Steve, The report I was parsing was a BLASTN. I'll submit a bug report.
Richard -----Original Message----- From: Steve Chervitz Trutane [mailto:steve@trutane.net] Sent: Thursday, February 19, 2004 4:34 PM To: rrouse@biomail.ucsd.edu Cc: Bioperl Subject: Re: [Bioperl-l] searchio scripts On Feb 18, 2004, at 3:09 PM, Richard Rouse wrote: > I tried Steve's suggestion by putting this right above: > while ( my $blast = $in->next_result() ) { > > Then putting another } at the end of the script. > > Doing this and then running a large blast output file, I got: > > Using SearchIO->new() > > Report 1: Bio::Search::Result::BlastResult=HASH(0x87d7fb8) > > Report 2: Bio::Search::Result::BlastResult=HASH(0x8dfea60) The ugly 'HASH' output is because we're no longer overloading "". You'll have to call to_string() on the BlastResult object to get prettier output. > > ------------- EXCEPTION ------------- > MSG: Trouble in ResultTableWriter::_set_row_data_func() eval: This is a bug in either the parsing code and/or GenericHit. I get the same trouble parsing /t/data/blast.report in the Bioperl distribution. Was the report you were parsing a TBLASTN? Could you file a bug report on this at http://bugzilla.bioperl.org/ ? Thanks. Steve > ------------- EXCEPTION ------------- > MSG: Can't get identical or conserved data: no data. 
> STACK Bio::Search::Hit::GenericHit::matches > ../..//Bio/Search/Hit/GenericHit.pm:852 > STACK Bio::Search::Hit::GenericHit::frac_identical > ../..//Bio/Search/Hit/GenericHit.pm:1043 > STACK (eval) (eval 310):1 > STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__ > ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:327 > STACK Bio::SearchIO::Writer::HitTableWriter::to_string > ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267 > STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321 > STACK Bio::SearchIO::blast::write_result > ../..//Bio/SearchIO/blast.pm:1495 > STACK toplevel new.mod.hitwriter.pl:106 > > -------------------------------------- > > > > STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__ > ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:329 > STACK Bio::SearchIO::Writer::HitTableWriter::to_string > ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267 > STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321 > STACK Bio::SearchIO::blast::write_result > ../..//Bio/SearchIO/blast.pm:1495 > STACK toplevel new.mod.hitwriter.pl:106 > > I tried Lincoln's suggestion as well. In this case I added: > > my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV); > > above > > while ( my $blast = $in->next_result() ) { > > > This script just runs getting no result. > > By the way I am running Suse linux 9.0, perl 5.8.1 > > Thanks, > Richard From sbassi at asalup.org Thu Feb 19 21:25:14 2004 From: sbassi at asalup.org (Sebastian Bassi) Date: Thu Feb 19 21:32:12 2004 Subject: [Bioperl-l] Tm calculation Message-ID: <4035700A.3050302@asalup.org> Hello, I suscribed to this list because I was told there is a thread about Tm. I did a Tm function for Biopython, based on the EMBOSS DAN. The good thing is that it performed exactly as DAN. The problem was that DAN formulae was dated (it was previous to Santalucia work). 
I have some questions:

- Regarding

  "#$entropy -= 1.4; #Should only be applied to self-complementary oligos, so add code to test self-complementarity before applying this line to Tm calculations"

  How do you define self-complementary oligos? Is this a
  self-complementary oligo: AAACCCTAGGGTTT?
  What about this one: AAACCCTCAGGGTTT?

- Does the proposed version test for overlapping pairs? I mean, take
  this sequence: ACCCGTGAGCTG. How many CC pairs does the program
  detect? The right answer should be 2. But if you are using a standard
  string find function, it may overlook the overlapping CC and detect
  only one. I had this problem in one of my attempts, so I had to make
  my own string find function (somebody from the biopython mailing list
  coded this function at my request, so I didn't actually write it).

I'm asking this instead of looking at the code because I don't know
enough Perl (that's why I work in Python :)

Sorry for my English, it is not my native language!

--
Best regards,

//=\ Sebastian Bassi - Diplomado en Ciencia y Tecnologia, UNQ //=\
\=// IT Manager Advanta Seeds - Balcarce Research Center - \=//
//=\ Pro secretario ASALUP - www.asalup.org - PGP key available //=\
\=// E-mail: sbassi@genesdigitales.com - ICQ UIN: 3356556 - \=//

http://Bioinformatica.info

From hlapp at gmx.net Fri Feb 20 03:37:17 2004
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri Feb 20 03:43:26 2004
Subject: [Bioperl-l] New GO Parser and errors loading biosql database
In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE08@nrcmrdex1d.imsb.nrc.ca>
Message-ID:

On Thursday, February 19, 2004, at 12:50 PM, Law, Annie wrote:

> However, many of the entries are not able to be
> inserted (roughly 200).
> Mostly complaining about how the column name cannot be null. However,
> I'm
> not sure if it is related to
> The make test errors I am having with bioperl-db that I have listed
> below or
> if this is an acceptable result.
> In general how should a user gauge how successful a load of the
> database
> was?
I guess you can sort > of look at the total number of expected number entries. It's always a good idea to look over the errors and check whether there are any that just don't make sense. The one below is an example: > > -------------------- WARNING --------------------- > MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values > were > ("BBD_pathwayID:C1cyc","","","") FKs (2) > Column 'name' cannot be null 'BBD_pathwayID:C1cyc' is *not* a GO term (all GO terms have identifiers that start with GO:). It's in fact a dbxref of a term that erroneously ends up as a term because in the 1.4 release of bioperl a bug had been introduced into the dagflat parser (which the GO parser basically is identical to). I strongly recommend you upgrade at a minimum the module Bio/OntologyIO/dagflat.pm with the one from cvs (tag branch-1-4). Alternatively, update the entire bioperl distribution from cvs (again, use branch-1-4). Doing so will get rid of most if not all of the errors. Generally speaking, there should be no or only a few terms that fail to load, and if any fail then they should only fail because of column width constraints or something similar. > > 2) I have a question about The make test bioperl-db results which may > be > related to the results that I am getting. I seem to be having problems > with > the make test for bioperl-db. I downloaded the tarball from the CVS > website > and installed it. > I looked at the documentation and I created User biosql which has been > given > all the permissions it needs. I also renamed the files as stated in > the > steps below. In the t directory of bioperl-db $ cd t $ cp > DBHarness.conf.example DBHarness.biosql.conf $ cp > DBHarness.conf.example > DBHarness.markerdb.conf You do not need to create DBHarness.markerdb.conf anymore. It's not used. > > I also put a copy of those file in the bioperl-db in the home directory > since that was documented for the newest version Of bioperl-db. Not sure where you found that. 
The only place where this file needs to reside is in the t/ directory. > I did a make test in the bioperl-db directory and go the following > results. > Most of the tests seem to fail. I am not sure why. Generally speaking, just read the error message. It often says why, and so does it here. > > [root@microarray bioperl-db]# maket test > PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" > "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t > t/cluster.......install_driver(mysql) failed: Can't load > '/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/ > mysql/mys > ql.so' for module DBD::mysql: > /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/ > mysql/mysq > l.so: undefined symbol: mysql_ssl_set at > /usr/lib/perl5/5.8.0/i386-linux-thread-multi/DynaLoader.pm line 229. > at > (eval 4) line 3 Compilation failed in require at (eval 4) line 3. > Perhaps a > required shared library or dll isn't installed where expected at > t/DBTestHarness.pm line 211 This says that your DBI driver could not be loaded. It has nothing to do with bioperl-db. You have either not or not successfully installed the mysql DBI driver, or you have installed it at a non-standard location, or you have installed it under another version of perl. Make sure the tests for the DBD::mysql module pass before trying to use the driver. Obviously, if the DBI driver can't be loaded, none of the tests will succeed, as then no database connection can be opened. > > 3) Previously when I did a make test for the Bioperl 1.4 installation > most > of the tests passed 97% I'm not sure whether the errors are expected > or not > Generally, *all* tests of a stable bioperl distribution (which 1.4 is) are supposed to pass. If one or more don't, then chances are high that something is wrong. > Here are the results of the make test. I only cut out the beginning > of the > test and the summary at the end. 
Installation of bioperl > > ------------- EXCEPTION ------------- > MSG: Failed to load module Bio::SeqIO::game. Can't locate IO/String.pm The message pretty much says it all. Bioperl does depend at a lot of places on IO::String, so I'd strongly recommend you go ahead and install it. > > 4) Also, hopefully when I get this all running I would like to know > what is > the best order for loading the database. I know you mentionned that > the GO > database information should be loaded before the locuslink > information. Here > is the list of proposed order of entering information into the > database. > Can you use load_seqdatabase.pl for loading unigene information? Yes you can. Make sure you read the POD of load_seqdatabase.pl to see how. > 1. load NCBI taxonomy database with load_ncbi_taxonomy.pl > 2. GO information The only things for which order matters are those which are referenced, but provided only in an incomplete manner, by annotated data sources. Hence, species information and any ontology that your data source uses for annotation should be loaded in advance so that upon loading of the annotated sequences the referenced entities are found by look-up. > 3. load locuslink database information > 4. unigene information which I also had problems with loading > information > in > [root@ bioperl-1.4]#perl > /root/bioperl-db/scripts/biosql/load_seqdatabase.pl > --dbuser=root --dbpass=ms22 --dbname bioseqdb > --namespace "Unigene" -format unigene > /root/bioperl--1.4/unigenedata/Hs.data > Loading /root/bioperl-1.4/unigenedata/Hs.data ... > Bio::SeqIO: unigene cannot be found > Exception > ------------- EXCEPTION ------------- > MSG: Failed to load module Bio::SeqIO::unigene. Can't locate > Bio/SeqIO/unigene.pm in @INC (@INC contains: The message pretty much says it. The indicated module, which is the bioperl unigene parser, fails to load. The reason is most likely that you didn't install bioperl, or installed in a location that is not in Perl's default search path. 
If the latter is the case, you need to setup the PERL5LIB environment variable prior to running any code that uses those modules. -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From brian_osborne at cognia.com Fri Feb 20 08:04:52 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Fri Feb 20 08:11:08 2004 Subject: [Bioperl-l] Version 1.4 for Windows Message-ID: O|B|F - Open Bioinformatics Foundation Update: Bioperl 1.4 for Windows February 20, 2004 ------------------------------------------------------------------------ http://news.open-bio.org/archives/2004_02.html#000068 ------------------------------------------------------------------------ Bioperl version 1.4 for Windows is available. Thanks once again to Nigam Shah for creating and testing the PPM and PPD files. -- From vince.forgetta at staff.mcgill.ca Fri Feb 20 10:14:40 2004 From: vince.forgetta at staff.mcgill.ca (Vince Forgetta) Date: Fri Feb 20 10:25:10 2004 Subject: [Bioperl-l] MSG: acc does not exist, but acc is OK and bioperl version is 1.2.2 Message-ID: <40362460.8050004@staff.mcgill.ca> Hi all, I am trying to use bioperl to retrieve a refseq accession e.g. NM_003000, but it throws the following error: ------------- EXCEPTION ------------- MSG: acc does not exist STACK Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177 My code is: use Bio::DB::RefSeq; $gb = new Bio::DB::RefSeq; my $seq; $seq = $gb->get_Seq_by_acc('$accession'); I have version 1.2.2 so it's not a problem of changing "protein" to "nucleotide" in GenBank.pm. 
When I change get_Seq_by_acc in WebDBSeqI.pm like so (added "$seqid" and
"ARRAY" to distinguish between the error messages):

    sub get_Seq_by_acc {
        my ($self,$seqid) = @_;
        $self->_sleep;
        my $seqio = $self->get_Stream_by_acc($seqid);
        $self->throw("acc $seqid does not exist") if( ! defined $seqio );
        my @seqs;
        while( my $seq = $seqio->next_seq() ) { push @seqs, $seq; }
        $self->throw("ARRAY acc does not exist") unless @seqs;
        if( wantarray ) { return @seqs } else { return shift @seqs }
    }

I get:

------------- EXCEPTION -------------
MSG: ARRAY acc does not exist
STACK Bio::DB::WebDBSeqI::get_Seq_by_acc
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177

So is the problem that the retrieval of the sequence fails? Very odd.

Thanks for the help.

Vince

--
Vincenzo Forgetta
Bioinformatics
McGill University and Genome Quebec Innovation Centre
740 Dr. Penfield Avenue Room 7211
Montreal, Quebec Canada, H3A 1A4
Tel: 514-398-3311 00476
Email: vince.forgetta@staff.mcgill.ca

From ak at ebi.ac.uk Fri Feb 20 10:39:03 2004
From: ak at ebi.ac.uk (Andreas Kahari)
Date: Fri Feb 20 10:45:19 2004
Subject: [Bioperl-l] MSG: acc does not exist, but acc is OK and bioperl version is 1.2.2
In-Reply-To: <40362460.8050004@staff.mcgill.ca>
References: <40362460.8050004@staff.mcgill.ca>
Message-ID: <20040220153903.GA18842@ebi.ac.uk>

On Fri, Feb 20, 2004 at 10:14:40AM -0500, Vince Forgetta wrote:
[cut]
> ------------- EXCEPTION -------------
> MSG: acc does not exist
> STACK Bio::DB::WebDBSeqI::get_Seq_by_acc
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177
>
> My code is:
>
> use Bio::DB::RefSeq;
> $gb = new Bio::DB::RefSeq;
> my $seq;
> $seq = $gb->get_Seq_by_acc('$accession');

Perl does not interpolate variables within single quotes.
You probably want to say $seq = $gb->get_Seq_by_acc($accession); Cheers, Andreas -- | {} | Andreas Kähäri |()()| |{}{}| EMBL, European Bioinformatics Institute | () | | {} | Wellcome Trust Genome Campus, Hinxton |()()| |{}{}| Cambridge, CB10 1SD | () | | {} | United Kingdom |()()| From vince.forgetta at staff.mcgill.ca Fri Feb 20 10:40:48 2004 From: vince.forgetta at staff.mcgill.ca (Vince Forgetta) Date: Fri Feb 20 10:54:16 2004 Subject: [Bioperl-l] MSG: acc does not exist, but acc is OK and bioperl version is 1.2.2 In-Reply-To: <20040220153903.GA18842@ebi.ac.uk> References: <40362460.8050004@staff.mcgill.ca> <20040220153903.GA18842@ebi.ac.uk> Message-ID: <40362A80.8000001@staff.mcgill.ca> I had originally put $accession without single quotes and got the same error. I tried the single quotes as a debugging step. I still have the same problem when I remove them. The problem seems to be sporadic. some days I can get accessions and other days I run into problems. Could this be the problem: http://bioperl.org/Core/Latest/faq.html#Q2.3 Thanks. Andreas Kahari wrote: >On Fri, Feb 20, 2004 at 10:14:40AM -0500, Vince Forgetta wrote: >[cut] > > >>------------- EXCEPTION ------------- >>MSG: acc does not exist >>STACK Bio::DB::WebDBSeqI::get_Seq_by_acc >>/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177 >> >>My code is: >> >> use Bio::DB::RefSeq; >> $gb = new Bio::DB::RefSeq; >> my $seq; >> $seq = $gb->get_Seq_by_acc('$accession'); >> >> > >Perl does not interpolate variables in within single quotes. >You probably want to say > > $seq = $gb->get_Seq_by_acc($accession); > > >Cheers, >Andreas > > > -- Vincenzo Forgetta Computational Biology McGill University and Genome Quebec Innovation Centre 740 Dr.
Penfield Avenue Room 7211 Montreal, Quebec Canada, H3A 1A4 Tel: 514-398-3311 00476 Email: vince.forgetta@staff.mcgill.ca From jason at cgt.duhs.duke.edu Fri Feb 20 11:14:32 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Fri Feb 20 11:21:04 2004 Subject: [Bioperl-l] MSG: acc does not exist, but acc is OK and bioperl version is 1.2.2 In-Reply-To: <40362A80.8000001@staff.mcgill.ca> References: <40362460.8050004@staff.mcgill.ca> <20040220153903.GA18842@ebi.ac.uk> <40362A80.8000001@staff.mcgill.ca> Message-ID: You can see if it is the case by going to http://www.ebi.ac.uk/cgi-bin/dbfetch and plugging in your accession. refseq is available for download from ncbi site - you will find this faster and more reliable than most webbased sequence server I expect. -jason On Fri, 20 Feb 2004, Vince Forgetta wrote: > I had originally put $accession without single quotes and got the same > error. I tried the single quotes as a debugging step. I still have the > same problem when I remove them. > > The problem seems to be sporadic. some days I can get accessions and > other days I run into problems. Could this be the problem: > > http://bioperl.org/Core/Latest/faq.html#Q2.3 > > Thanks. > > Andreas Kahari wrote: > > >On Fri, Feb 20, 2004 at 10:14:40AM -0500, Vince Forgetta wrote: > >[cut] > > > > > >>------------- EXCEPTION ------------- > >>MSG: acc does not exist > >>STACK Bio::DB::WebDBSeqI::get_Seq_by_acc > >>/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177 > >> > >>My code is: > >> > >> use Bio::DB::RefSeq; > >> $gb = new Bio::DB::RefSeq; > >> my $seq; > >> $seq = $gb->get_Seq_by_acc('$accession'); > >> > >> > > > >Perl does not interpolate variables in within single quotes. 
> >You probably want to say > > > > $seq = $gb->get_Seq_by_acc($accession); > > > > > >Cheers, > >Andreas > > > > > > > > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From vince.forgetta at staff.mcgill.ca Fri Feb 20 11:18:27 2004 From: vince.forgetta at staff.mcgill.ca (Vince Forgetta) Date: Fri Feb 20 11:29:22 2004 Subject: [Bioperl-l] MSG: acc does not exist, but acc is OK and bioperl version is 1.2.2 In-Reply-To: References: <40362460.8050004@staff.mcgill.ca> <20040220153903.GA18842@ebi.ac.uk> <40362A80.8000001@staff.mcgill.ca> Message-ID: <40363353.6080303@staff.mcgill.ca> Thanks a bunch ! Turns out that EBI doesn't have NM_003000, but NCBI does. I thought they were the same thing ! I'll just DL refseq from NCBI. Vince Jason Stajich wrote: >You can see if it is the case by going to >http://www.ebi.ac.uk/cgi-bin/dbfetch >and plugging in your accession. > >refseq is available for download from ncbi site - you will find this >faster and more reliable than most webbased sequence server I expect. > >-jason >On Fri, 20 Feb 2004, Vince Forgetta wrote: > > > >>I had originally put $accession without single quotes and got the same >>error. I tried the single quotes as a debugging step. I still have the >>same problem when I remove them. >> >>The problem seems to be sporadic. some days I can get accessions and >>other days I run into problems. Could this be the problem: >> >>http://bioperl.org/Core/Latest/faq.html#Q2.3 >> >>Thanks. >> >>Andreas Kahari wrote: >> >> >> >>>On Fri, Feb 20, 2004 at 10:14:40AM -0500, Vince Forgetta wrote: >>>[cut] >>> >>> >>> >>> >>>>------------- EXCEPTION ------------- >>>>MSG: acc does not exist >>>>STACK Bio::DB::WebDBSeqI::get_Seq_by_acc >>>>/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177 >>>> >>>>My code is: >>>> >>>> use Bio::DB::RefSeq; >>>> $gb = new Bio::DB::RefSeq; >>>> my $seq; >>>> $seq = $gb->get_Seq_by_acc('$accession'); >>>> >>>> >>>> >>>> >>>Perl does not interpolate variables in within single quotes. 
>>>You probably want to say >>> >>> $seq = $gb->get_Seq_by_acc($accession); >>> >>> >>>Cheers, >>>Andreas >>> >>> >>> >>> >>> >> >> >> > >-- >Jason Stajich >Duke University >jason at cgt.mc.duke.edu > > > -- Vincenzo Forgetta Computational Biology McGill University and Genome Quebec Innovation Centre 740 Dr. Penfield Avenue Room 7211 Montreal, Quebec Canada, H3A 1A4 Tel: 514-398-3311 00476 Email: vince.forgetta@staff.mcgill.ca From cjfields at uiuc.edu Fri Feb 20 13:39:03 2004 From: cjfields at uiuc.edu (Chris Fields) Date: Fri Feb 20 13:45:21 2004 Subject: [Bioperl-l] Version 1.4 for Windows In-Reply-To: References: Message-ID: <6.0.0.22.2.20040220122738.01bd1480@express.cites.uiuc.edu> There are two additional dependencies listed (HTML-Entities and IO-Scalar) that PPM 3.1 can't locate, although they are part of HTML-Parser and IO-stringy, listed as separate dependencies. I think this is confusing PPM. When using PPM, typing "install bioperl" only installs ver. 1.2. Asking it to "install Bioperl-1.4" fails b/c it can't find the two dependencies listed above. Any workarounds? Chris At 07:04 AM 2/20/2004, you wrote: >O|B|F - Open Bioinformatics Foundation Update: Bioperl 1.4 for Windows > > February 20, 2004 > > >------------------------------------------------------------------------ > >http://news.open-bio.org/archives/2004_02.html#000068 > > > >------------------------------------------------------------------------ > >Bioperl version 1.4 for Windows is available. Thanks once again >to Nigam Shah for creating and testing the PPM and PPD files. > >-- > > > >_______________________________________________ >Bioperl-l mailing list >Bioperl-l@portal.open-bio.org >http://portal.open-bio.org/mailman/listinfo/bioperl-l __________________________________ Chris Fields - Postdoctoral Researcher Lab of Dr. Robert Switzer Address: University of Illinois at Urbana-Champaign Dept. of Biochemistry - 323 RAL 600 S. Mathews Ave. 
Urbana, IL 61801 Phone : (217) 333-7098 Fax : (217) 244-5858 From sjmiller at email.arizona.edu Fri Feb 20 14:45:01 2004 From: sjmiller at email.arizona.edu (Susan J. Miller) Date: Fri Feb 20 14:51:13 2004 Subject: [Bioperl-l] Problem with Bio::Factor::EMBOSS Message-ID: <403663BD.8070301@email.arizona.edu> We just installed bioperl-run-1.4 (on a sun4u sparc SUNW,Ultra-4 running Solaris8), and I am not able to pass the value zero as a parameter to the EMBOSS tools - when I do I get an error message saying "MISSING MANDATORY ATTRIBUTE". I've tried a couple different EMBOSS programs, and the result is the same. I've also tried various forms of zero (0, '0', a variable contining zero)... My code: ========================================================================== use Bio::Factory::EMBOSS; @files = glob("*.fasta"); foreach $f (@files) { $emb_fac = Bio::Factory::EMBOSS->new(-verbose => 1); $rev = $emb_fac->program('revseq'); $rev->run({'-sequence' => "$f", -outseq => "SE1", -sbegin1 => '0', -send1 => '100'}); # '0' gives error! $mut = $emb_fac->program('msbar'); $mut->run({'-sequence' => "$f", '-count' => '100', '-point' => '1', '-block' => '1', '-codon' => 0, '-outseq' => "$f.mut"}); } ========================================================================== Both revseq and msbar work if I pass a non-zero value. 
With 0, here is the verbose output: ========================================================================== $VAR1 = { '-outseq' => 'SE1', '-send1' => '100', '-sbegin1' => '0', '-sequence' => 'Seq1.cgi' }; Input attr: outseq => SE1 Input attr: send1 => 100 Input attr: sbegin1 => 0 Input attr: sequence => Seq1.cgi Command line: revseq -outseq SE1 -send1 100 -sbegin1 -sequence Seq1.cgi -auto Died: value required for -sbegin1 $VAR1 = { '-outseq' => 'Seq1.cgi.mut', '-count' => '100', '-codon' => 0, '-sequence' => 'Seq1.cgi', '-block' => '1', '-point' => '1' }; Input attr: outseq => Seq1.cgi.mut Input attr: count => 100 Input attr: codon => 0 Input attr: sequence => Seq1.cgi Input attr: block => 1 Input attr: point => 1 -------------------------------------- MISSING MANDATORY ATTRIBUTE: -codon -------------------------------------- $VAR1 = { 'category' => 'mandatory', 'values' => '0(None)1(Any of the following)2(Insertions)3(Deletions)4(Changes)5(Duplications)6(Moves)', 'descr' => 'Types of codon mutations to perform. These are only done if the sequence is nucleic.', 'unnamed' => 0, 'default' => '0' }; ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Program msbar needs attribute [-codon]! STACK: Error::throw STACK: Bio::Root::Root::throw /usr/local/lib/perl5/site_perl/5.6.0/Bio/Root/Root.pm:342 STACK: Bio::Tools::Run::EMBOSSApplication::run /usr/local/lib/perl5/site_perl/5.6.0/Bio/Tools/Run/EMBOSSApplication.pm:229 STACK: ./ex4b.pl:28 ----------------------------------------------------------- ========================================================================== Is this a bug? Is there a work-around? -- Thanks, -susan Susan J. 
Miller Biotechnology Computing Facility Arizona Research Laboratories Bio West 228 University of Arizona Tucson, AZ 85721 (520) 626-2597 From jason at cgt.duhs.duke.edu Fri Feb 20 14:53:58 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Fri Feb 20 15:00:14 2004 Subject: [Bioperl-l] Problem with Bio::Factor::EMBOSS In-Reply-To: <403663BD.8070301@email.arizona.edu> References: <403663BD.8070301@email.arizona.edu> Message-ID: Presumably changing line 226 to + unless (defined $input->{$attr}) { from - unless ( $input->{$attr}) { will fix it - can you let us know if it does. -jason On Fri, 20 Feb 2004, Susan J. Miller wrote: > We just installed bioperl-run-1.4 (on a sun4u sparc SUNW,Ultra-4 running > Solaris8), and I am not able to pass the value zero as a parameter to > the EMBOSS tools - when I do I get an error message saying "MISSING > MANDATORY ATTRIBUTE". I've tried a couple different EMBOSS programs, > and the result is the same. I've also tried various forms of zero (0, > '0', a variable contining zero)... > > My code: > ========================================================================== > use Bio::Factory::EMBOSS; > > @files = glob("*.fasta"); > foreach $f (@files) { > $emb_fac = Bio::Factory::EMBOSS->new(-verbose => 1); > > $rev = $emb_fac->program('revseq'); > $rev->run({'-sequence' => "$f", -outseq => "SE1", > -sbegin1 => '0', -send1 => '100'}); # '0' gives error! > > $mut = $emb_fac->program('msbar'); > $mut->run({'-sequence' => "$f", '-count' => '100', > '-point' => '1', '-block' => '1', > '-codon' => 0, > '-outseq' => "$f.mut"}); > } > ========================================================================== > Both revseq and msbar work if I pass a non-zero value. 
With 0, here is > the verbose output: > ========================================================================== > $VAR1 = { > '-outseq' => 'SE1', > '-send1' => '100', > '-sbegin1' => '0', > '-sequence' => 'Seq1.cgi' > }; > Input attr: outseq => SE1 > Input attr: send1 => 100 > Input attr: sbegin1 => 0 > Input attr: sequence => Seq1.cgi > Command line: revseq -outseq SE1 -send1 100 -sbegin1 -sequence Seq1.cgi > -auto > Died: value required for -sbegin1 > > $VAR1 = { > '-outseq' => 'Seq1.cgi.mut', > '-count' => '100', > '-codon' => 0, > '-sequence' => 'Seq1.cgi', > '-block' => '1', > '-point' => '1' > }; > Input attr: outseq => Seq1.cgi.mut > Input attr: count => 100 > Input attr: codon => 0 > Input attr: sequence => Seq1.cgi > Input attr: block => 1 > Input attr: point => 1 > -------------------------------------- > MISSING MANDATORY ATTRIBUTE: -codon > -------------------------------------- > $VAR1 = { > 'category' => 'mandatory', > 'values' => '0(None)1(Any of the > following)2(Insertions)3(Deletions)4(Changes)5(Duplications)6(Moves)', > 'descr' => 'Types of codon mutations to perform. These are > only done if the sequence is nucleic.', > 'unnamed' => 0, > 'default' => '0' > }; > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Program msbar needs attribute [-codon]! > > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/local/lib/perl5/site_perl/5.6.0/Bio/Root/Root.pm:342 > STACK: Bio::Tools::Run::EMBOSSApplication::run > /usr/local/lib/perl5/site_perl/5.6.0/Bio/Tools/Run/EMBOSSApplication.pm:229 > STACK: ./ex4b.pl:28 > ----------------------------------------------------------- > ========================================================================== > > > Is this a bug? Is there a work-around? 
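A minimal sketch of why the suggested change from a truth test to a defined() test matters for a value of zero. The helper names and the attribute hash below are hypothetical and are not taken from Bio::Tools::Run::EMBOSSApplication; they only mimic the two styles of check.

```perl
use strict;
use warnings;

# Hypothetical helpers (not part of bioperl-run): each returns the
# mandatory attributes that a given style of check would report as missing.
sub missing_by_truth {
    my ($input, @mandatory) = @_;
    # A plain truth test treats 0 and '0' as if no value had been given.
    return grep { !$input->{$_} } @mandatory;
}

sub missing_by_defined {
    my ($input, @mandatory) = @_;
    # A defined() test only rejects attributes that were never set.
    return grep { !defined $input->{$_} } @mandatory;
}

# Assumed sample input, modelled on the attributes in the report above.
my %input     = ('-codon' => 0, '-sbegin1' => '0', '-count' => '100');
my @mandatory = ('-codon', '-sbegin1', '-count');

print "truth test flags:   @{[ missing_by_truth(\%input, @mandatory) ]}\n";
print "defined test flags: @{[ missing_by_defined(\%input, @mandatory) ]}\n";
```

With the truth test, both zero-valued attributes are reported as missing; with defined(), none are. Whether this also covers the empty -sbegin1 on the revseq command line is not something this sketch can show.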
> > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From shawnh at stanford.edu Fri Feb 20 14:58:21 2004 From: shawnh at stanford.edu (Shawn Hoon) Date: Fri Feb 20 15:04:32 2004 Subject: [Bioperl-l] quick question Message-ID: <22541E64-63DF-11D8-A0E8-000A95783436@stanford.edu> Anybody have a quick way of parsing concatenated clustalw files? I could split the files up but wonder if anybody had a quick solution. I don't think the AlignIO parse seems to handle this. shawn From jason at cgt.duhs.duke.edu Fri Feb 20 15:10:25 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Fri Feb 20 15:16:39 2004 Subject: [Bioperl-l] quick question In-Reply-To: <22541E64-63DF-11D8-A0E8-000A95783436@stanford.edu> References: <22541E64-63DF-11D8-A0E8-000A95783436@stanford.edu> Message-ID: if there is a clustalw header separating the concatenated alignments AlignIO is supposed to handle it. We might want to change the code below in AlignIO::clustalw to ignore blank lines... my $first_line; if( defined ($first_line = $self->_readline ) && $first_line !~ /CLUSTAL/ ) { $self->warn("trying to parse a file which does not start with a CLUSTAL header"); } On Fri, 20 Feb 2004, Shawn Hoon wrote: > Anybody have a quick way of parsing concatenated clustalw files? > I could split the files up but wonder if anybody had a quick solution. > I don't think the AlignIO parse seems to handle this. > > shawn > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu
From shawnh at stanford.edu Fri Feb 20 18:00:31 2004 From: shawnh at stanford.edu (Shawn Hoon) Date: Fri Feb 20 18:06:42 2004 Subject: [Bioperl-l] quick question In-Reply-To: References: <22541E64-63DF-11D8-A0E8-000A95783436@stanford.edu> Message-ID: <954193CE-63F8-11D8-A0E8-000A95783436@stanford.edu> On Feb 20, 2004, at 12:10 PM, Jason Stajich wrote: > if there is a clustalw header separating the concatenated alignments > AlignIO is supposed to handle it. > Maybe I'm wrong, but there is no catch in the clustalw module to break if one sees the header so it keeps going.. in anycase, i have committed a fix that seems to work. thanks shawn > We might want to change the code below in AlignIO::clustalw to ignore > blank lines... > > my $first_line; > if( defined ($first_line = $self->_readline ) > && $first_line !~ /CLUSTAL/ ) { > $self->warn("trying to parse a file which does not start with > a CLUSTAL header"); > } > > On Fri, 20 Feb 2004, Shawn Hoon wrote: > >> Anybody have a quick way of parsing concatenated clustalw files? >> I could split the files up but wonder if anybody had a quick solution. >> I don't think the AlignIO parse seems to handle this. >> >> shawn >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l@portal.open-bio.org >> http://portal.open-bio.org/mailman/listinfo/bioperl-l >> > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l
From kvddrift at earthlink.net Sat Feb 21 08:09:14 2004 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sat Feb 21 08:15:30 2004 Subject: [Bioperl-l] bioperl on Mac OS X Message-ID: <25A0D726-646F-11D8-B78C-003065A5FDCC@earthlink.net> Hi, I am pleased to announce that the current version of bioperl (1.4) is now available for Mac OS X users that use fink. Currently it is only available for 10.3, and the unstable tree of fink needs to be enabled. I have added as much dependencies as possible, so you should be able to use almost all features of bioperl. The only modules that I left out are SVG::Graph and AceDB which are not available with fink. I also did not include mysql support, which is a big package, and only used in a few cases. However, you can always install dbd-mysl-pm through fink first, and then bioperl if you need this feature. I would appreciate any comments (positive or negative). thanks, - Koen.
From birney at ebi.ac.uk Sat Feb 21 14:37:22 2004 From: birney at ebi.ac.uk (Ewan Birney) Date: Sat Feb 21 14:43:28 2004 Subject: [Bioperl-l] bioperl on Mac OS X In-Reply-To: <25A0D726-646F-11D8-B78C-003065A5FDCC@earthlink.net> Message-ID: Cool. Many thanks Koen. From sbassi at asalup.org Sat Feb 21 22:13:32 2004 From: sbassi at asalup.org (Sebastian Bassi) Date: Sat Feb 21 22:20:15 2004 Subject: [Bioperl-l] Tm calculation In-Reply-To: <4036870F.5070402@genetics.utah.edu> References: <4035700A.3050302@asalup.org> <4036870F.5070402@genetics.utah.edu> Message-ID: <40381E5C.5040900@asalup.org> Barry Moore wrote: > Sebastian, > I would say that your first oligo (AAACCCTAGGGTTT) is complimentary, but I've been reading the paper and searching the net for an implementation of the Santalucia formulae. I found two interesting things: 1- Santalucia's lab has a webpage with a Tm calculator server (http://ozone2.chem.wayne.edu/Hyther/hytherm1main.html). The code can't be accessed since its a server side CGI script. 2- I found another web page that returns ALMOST the same results that Santalucia page (I think the very small difference is just because of round errors). But this one, it has it's code in JS, so the code is available. Take a look here: http://www.promega.com/biomath/calc11.htm The code is here: http://www.promega.com/biomath/oligotm.js If this implementation is OK, we could just translate it to perl/python. I am working on that right now. 
-- Best regards, //=\ Sebastian Bassi - Diplomado en Ciencia y Tecnologia, UNQ //=\ \=// IT Manager Advanta Seeds - Balcarce Research Center - \=// //=\ Pro secretario ASALUP - www.asalup.org - PGP key available //=\ \=// E-mail: sbassi@genesdigitales.com - ICQ UIN: 3356556 - \=// http://Bioinformatica.info From parvesh at pacific.net.sg Sun Feb 22 04:00:44 2004 From: parvesh at pacific.net.sg (Parvesh) Date: Sun Feb 22 13:54:32 2004 Subject: [Bioperl-l] help with Bioperl Message-ID: <001301c3f922$5b812360$f55018d2@yourr64slkwmas> Hi All, IS there a method in Bioperl to map the amino acid exon structure to the genomic sequence? Could you help me to locate this and help me to explain how to use it ? Thanks very much for your help. Best wishes parvesh From jason at cgt.duhs.duke.edu Sun Feb 22 14:52:10 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Sun Feb 22 14:58:24 2004 Subject: [Bioperl-l] aa -> dna alignment (was: help with Bioperl) In-Reply-To: <001301c3f922$5b812360$f55018d2@yourr64slkwmas> References: <001301c3f922$5b812360$f55018d2@yourr64slkwmas> Message-ID: Not directly - you can use genewise or Guy Slater's exonerate with the protein2dna model and then use Bioperl parsers to parse these reports. You can also use BLAT, TBLASTN, TFASTY but you may have to cleanup these alignments some. Genewise should give the most accurate alignments but can be slow if you don't already know about where your gene should land in the genomic sequence. -jason On Sun, 22 Feb 2004, Parvesh wrote: > Hi All, IS there a method in Bioperl to map the amino acid exon > structure to the genomic sequence? > > Could you help me to locate this and help me to explain how to use it ? Thanks very much for your help. 
> > Best wishes > parvesh > -- Jason Stajich Duke University jason at cgt.mc.duke.edu
From bazin at univ-montp2.fr Mon Feb 23 05:20:31 2004 From: bazin at univ-montp2.fr (Eric Bazin) Date: Mon Feb 23 05:26:09 2004 Subject: [Bioperl-l] BioQuery failure Message-ID: <4039D3EF.9000100@univ-montp2.fr> Hi, I discovered bioperl-db few days ago and i'm very enthusiatic using this tool but i've got a problem using BioQuery. I would be grateful if anybody can give me an answer about that.
This is a piece of my code: my $db = Bio::DB::BioDB->new(-database => "biosql", -host => $host, -dbname => $dbname, -driver => $driver, -user => $dbuser, -pass => $dbpass, -verbose => 10, ); my $query = Bio::DB::Query::BioQuery->new( -datacollections=>["Bio::SeqI seq", "Bio::Annotation::Reference ref", "Bio::Annotation::Reference<=>Bio::SeqI" ], -select=>["ref.authors"], -where=>["and", "seq.accession_number='AJ311144'", seq.display_id='AAG311144'"] ); $query->flag("distinct", 1); my $adaptor = $db->get_object_adaptor("Bio::Annotation::Reference"); my @tab = $adaptor->find_by_query($query); I receive this error message: ------------- EXCEPTION ------------- MSG: slot 'accession' not mapped to column for table bioentry STACK Bio::DB::Query::BioQuery::_map_slot_to_col /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:487 STACK Bio::DB::Query::BioQuery::_map_constraint_slots_to_columns /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:369 STACK Bio::DB::Query::BioQuery::_map_constraint_slots_to_columns /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:356 STACK Bio::DB::Query::BioQuery::translate_query /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:305 STACK Bio::DB::BioSQL::BaseDriver::translate_query /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BaseDriver.pm:1182 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_query /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1198 STACK (eval) /var/www/cgi-bin/getentry.pl:97 STACK toplevel /var/www/cgi-bin/getentry.pl:67 -- Eric Bazin Laboratoire "Génome Populations Interactions Adaptation" UM2 - IFREMER - CNRS UMR 5171 Université de Montpellier 2 C.C.
63, bâtiment 24 ;34095 Montpellier Cedex 5 Tel:(0)4-67-14-39-13 Tel perso:(0)6-20-91-49-62 Fax:(0)4-67-14-45-54 http://www.univ-montp2.fr/~genetix Seminaires internes: http://162.38.181.25/seminaire.html From Xiaoying.Lin at celera.com Mon Feb 23 09:09:53 2004 From: Xiaoying.Lin at celera.com (Lin, Xiaoying) Date: Mon Feb 23 09:15:59 2004 Subject: [Bioperl-l] help with Bioperl Message-ID: There is a non-Bioperl solution, a package called AAT by Huang et al. You can access his server at http://deepc2.zool.iastate.edu/aat/aat/aat.html I do not think there is a Bioperl parser for it yet. In my hands it does a better job of getting the exon structure right than other similar tools, especially for genes with repetitive sequence and tandem gene clusters. The speed is generally several-fold faster, but this should be taken with a grain of salt, since the default parameters used by different programs are not the same. -Xiaoying > -----Original Message----- > From: Parvesh [mailto:parvesh@pacific.net.sg] > Sent: Sunday, February 22, 2004 4:01 AM > To: bioperl-l@bioperl.org > Subject: [Bioperl-l] help with Bioperl > > > Hi All, > IS there a method in Bioperl to map the amino acid exon > structure to the genomic sequence? > > Could you help me to locate this and help me to explain how > to use it ? Thanks very much for your help. > > Best wishes > parvesh > From MEC at Stowers-Institute.org Mon Feb 23 12:51:04 2004 From: MEC at Stowers-Institute.org (Cook, Malcolm) Date: Mon Feb 23 12:57:13 2004 Subject: [Bioperl-l] Bio::Tools::GFF use of seqname Message-ID: Dear Matthew, Ewan, et al. I see in three places in Bio::Tools::GFF the following: if( $feat->can('seqname') ) { $name = $feat->seq_id(); $name ||= 'SEQ'; } else { $name = 'SEQ'; } However, in Bio::SeqFeature::Generic we learn that $self->warn("-seqname is deprecated. Please use -seq_id instead."); So, should we rewrite those fragments in Bio::Tools::GFF as: $name = $feat->seq_id() || 'SEQ' ??
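A small sketch of the one-line form proposed above, for anyone who wants to check its behaviour in isolation. MockFeature is a hypothetical stand-in that models only seq_id(); it is not a BioPerl class, and gff_seq_name is an illustrative name, not a method of Bio::Tools::GFF.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a feature object; only seq_id() is modelled.
package MockFeature;
sub new    { my ($class, $id) = @_; return bless { id => $id }, $class }
sub seq_id { return $_[0]->{id} }

package main;

# The simplified form: fall back to 'SEQ' whenever seq_id() returns
# a false value (undef, empty string, or '0').
sub gff_seq_name {
    my ($feat) = @_;
    return $feat->seq_id() || 'SEQ';
}

print gff_seq_name(MockFeature->new('chr1')), "\n";   # prints chr1
print gff_seq_name(MockFeature->new(undef)), "\n";    # prints SEQ
```

One thing to keep in mind with || here: a sequence id that is literally '0' would also be replaced by 'SEQ', which is the same behaviour as the existing $name ||= 'SEQ' line.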
Thanks, Malcolm Cook Database Applications Manager, Bioinformatics Stowers Institute for Medical Research From pst at ksu.edu Mon Feb 23 12:58:36 2004 From: pst at ksu.edu (Paul St. Amand) Date: Mon Feb 23 13:06:25 2004 Subject: [Bioperl-l] Help with reversing a sequence Message-ID: Hi, I am using the following script to get a subsequence and reverse it. Note that I do NOT want the "reverse complement" of the sequence here, just the actual reverse. BioPerl has a method to get the revcom of a seq, such as: print $outputfh "Reverse complemented sequence 5 to 10 is ",$seqobj->trunc(5,10)->revcom->seq, " \n"; Does BioPerl have a similar/better way to get the reverse (not revcom) of a sequence? This is how I am doing it and it is slow. Is there a way that is faster or "better" using BioPerl??? use strict; use warnings; use Bio::SeqIO; my $outputfh = *STDOUT; my ($infile, $in, $out, $seqobj); $infile = shift or die; $in = Bio::SeqIO->new('-file' => $infile , '-format' => 'Fasta'); $seqobj = $in->next_seq(); $out = Bio::SeqIO->newFh('-format' => 'fasta', '-noclose' => 1, '-fh' => $outputfh); print $outputfh ">MyReversedSeq29856-29862\n",scalar reverse($seqobj->subseq(29856,29862)),"\n"; Thanks, Paul From pst at ksu.edu Mon Feb 23 13:07:36 2004 From: pst at ksu.edu (Paul St. Amand) Date: Mon Feb 23 13:15:24 2004 Subject: [Bioperl-l] bioperl on Mac OS X Message-ID: <28EF51BF-662B-11D8-B2C9-0003938893E4@ksu.edu> BioPerl on MacOSX has been great for me. I am just starting out and do not know perl at all, but with fink and your porting work on BioPerl, I can do some really useful stuff. Many thanks! Paul From hlapp at gnf.org Mon Feb 23 13:13:34 2004 From: hlapp at gnf.org (Hilmar Lapp) Date: Mon Feb 23 13:19:34 2004 Subject: [Bioperl-l] BioQuery failure In-Reply-To: <4039D3EF.9000100@univ-montp2.fr> Message-ID: First off, I have no clue where the code is taking the column accession from, since you give the correct attribute name accession_number. For the rest see below. 
On Monday, February 23, 2004, at 02:20 AM, Eric Bazin wrote: > Hi, > > I discovered bioperl-db few days ago and i'm very enthusiatic using > this > tool but i've got a problem using BioQuery. I would be grateful if > anybody can give me an answer about that. > > This a piece of my code: > > my $db = Bio::DB::BioDB->new(-database => "biosql", > -host => $host, > -dbname => $dbname, > -driver => $driver, > -user => $dbuser, > -pass => $dbpass, > -verbose => 10, > ); > my $query = Bio::DB::Query::BioQuery->new( > -datacollections=>["Bio::SeqI seq", > "Bio::Annotation::Reference ref", > "Bio::Annotation::Reference<=>Bio::SeqI" > ], > -select=>["ref.authors"], Note that the -select parameter or setting will be ignored, since the adaptors need to have control over the select list in order to be able to build objects. > -where=>["and", "seq.accession_number='AJ311144'", > seq.display_id='AAG311144'"] > ); > > $query->flag("distinct", 1); > > my $adaptor = $db->get_object_adaptor("Bio::Annotation::Reference"); > my @tab = $adaptor->find_by_query($query); Note that find_by_query() returns an object to you (a Bio::DB::Query::QueryResultI-compliant instance), which is basically an iterator over the result set (call $query_result->next_object() until it returns undef). > > I receive this error message: > > ------------- EXCEPTION ------------- > MSG: slot 'accession' not mapped to column for table bioentry As I said, I have no clue how you might get here. First off, to exclude the obvious, you did obtain the latest revision from CVS, right? Also, the test suite that comes with bioperl-db did or did not pass all tests? If your answer is yes to both of the questions above, we need to get more verbose debugging output. Insert the following statement after you obtain the $db handle: $db->verbose(1); Then run the code again, capture the output in a file, and send it to me. 
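[Editorial sketch] The note above that find_by_query() hands back an iterator rather than a list can be illustrated roughly as follows. This is a minimal sketch, not bioperl-db code: MockQueryResult is a hypothetical stand-in for a Bio::DB::Query::QueryResultI-compliant object, used only so the loop runs without a database. With a real adaptor the iterator would presumably come from $adaptor->find_by_query($query) instead.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a QueryResultI-style iterator (assumption:
# next_object() returns the next object, or undef once exhausted).
package MockQueryResult;
sub new         { my ($class, @objs) = @_; return bless { objs => [@objs] }, $class }
sub next_object { my $self = shift; return shift @{ $self->{objs} } }

package main;

# With bioperl-db this would be the return value of find_by_query().
my $query_result = MockQueryResult->new('reference A', 'reference B');

# Drain the iterator: call next_object() until it returns undef.
my @refs;
while (defined(my $obj = $query_result->next_object())) {
    push @refs, $obj;
}
print scalar(@refs), " objects fetched\n";
```

Assigning the result straight to an array, as in the quoted code, would leave the iterator object itself as the array's only element, which is why the loop is needed.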
-hilmar > STACK Bio::DB::Query::BioQuery::_map_slot_to_col > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:487 > STACK Bio::DB::Query::BioQuery::_map_constraint_slots_to_columns > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:369 > STACK Bio::DB::Query::BioQuery::_map_constraint_slots_to_columns > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:356 > STACK Bio::DB::Query::BioQuery::translate_query > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:305 > STACK Bio::DB::BioSQL::BaseDriver::translate_query > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BaseDriver.pm:1182 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_query > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:1198 > STACK (eval) /var/www/cgi-bin/getentry.pl:97 > STACK toplevel /var/www/cgi-bin/getentry.pl:67 > > -- > Eric Bazin > Laboratoire "Génome Populations Interactions Adaptation" > UM2 - IFREMER - CNRS UMR 5171 > Université de Montpellier 2 > C.C. 63, bâtiment 24 ;34095 Montpellier Cedex 5 > Tel:(0)4-67-14-39-13 > Tel perso:(0)6-20-91-49-62 > Fax:(0)4-67-14-45-54 > http://www.univ-montp2.fr/~genetix > Seminaires internes: http://162.38.181.25/seminaire.html > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From hlapp at gmx.net Mon Feb 23 13:19:28 2004 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon Feb 23 13:25:25 2004 Subject: [Bioperl-l] Bio::Tools::GFF use of seqname In-Reply-To: Message-ID: You mean replace can('seqname') by can('seq_id')? Actually, $feat->can('seq_id') must be true at all times iff $feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to test for it.
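[Editorial sketch] The point under discussion, that the can() test is redundant and a single fallback expression is enough, can be sketched as below. MockFeature and gff_seqname are hypothetical names invented for illustration; MockFeature only imitates the seq_id() accessor of a Bio::SeqFeatureI object so that the snippet runs without BioPerl installed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Bio::SeqFeatureI implementation; only the
# seq_id() accessor matters here.
package MockFeature;
sub new    { my ($class, %args) = @_; return bless { seq_id => $args{seq_id} }, $class }
sub seq_id { my $self = shift; return $self->{seq_id} }

package main;

# The one-line form proposed in the thread: use the feature's seq_id,
# falling back to 'SEQ' when it is undefined or empty.
sub gff_seqname {
    my ($feat) = @_;
    return $feat->seq_id() || 'SEQ';
}

print gff_seqname(MockFeature->new(seq_id => 'chrI')), "\n";  # feature with an id
print gff_seqname(MockFeature->new()), "\n";                  # no id: falls back
```

One caveat that applies equally to the original six-line version: || treats any false value as missing, so a seq_id of "0" would also be replaced by 'SEQ'.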
-hilmar On Monday, February 23, 2004, at 09:51 AM, Cook, Malcolm wrote: > Dear Matthew, Ewan, et al1 > > I see in three places in Bio::Tools::GFF the following: > > if( $feat->can('seqname') ) { > $name = $feat->seq_id(); > $name ||= 'SEQ'; > } else { > $name = 'SEQ'; > } > > However, in Bio::SeqFeature::Generic we learn that > > $self->warn("-seqname is deprecated. Please use -seq_id > instead."); > > So, should we rewrite those fragments in Bio::Tools:GFF as: > $name = $feat->seq_id() || 'SEQ' > > ?? > > Thanks, > > Malcolm Cook > Database Applications Manager, Bioinformatics > Stowers Institute for Medical Research > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From jason at cgt.duhs.duke.edu Mon Feb 23 13:53:38 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Feb 23 13:59:45 2004 Subject: [Bioperl-l] Help with reversing a sequence In-Reply-To: References: Message-ID: The perl reverse cmd is what you want that does exactly what you want -- reverse a string. What is slow specifically? Did you benchmark something? -jason On Mon, 23 Feb 2004, Paul St. Amand wrote: > Hi, > > I am using the following script to get a subsequence and reverse it. > Note that I do NOT want the "reverse complement" of the sequence here, > just the actual reverse. BioPerl has a method to get the revcom of a > seq, such as: > > print $outputfh "Reverse complemented sequence 5 to 10 is > ",$seqobj->trunc(5,10)->revcom->seq, " \n"; > > Does BioPerl have a similar/better way to get the reverse (not revcom) > of a sequence? > > This is how I am doing it and it is slow. Is there a way that is faster > or "better" using BioPerl??? 
> > > use strict; > use warnings; > use Bio::SeqIO; > my $outputfh = *STDOUT; > > my ($infile, $in, $out, $seqobj); > $infile = shift or die; > > $in = Bio::SeqIO->new('-file' => $infile , > '-format' => 'Fasta'); > $seqobj = $in->next_seq(); > > $out = Bio::SeqIO->newFh('-format' => 'fasta', > '-noclose' => 1, > '-fh' => $outputfh); > > print $outputfh ">MyReversedSeq29856-29862\n",scalar > reverse($seqobj->subseq(29856,29862)),"\n"; > > > > > > Thanks, > Paul > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From MEC at Stowers-Institute.org Mon Feb 23 14:53:03 2004 From: MEC at Stowers-Institute.org (Cook, Malcolm) Date: Mon Feb 23 14:59:11 2004 Subject: [Bioperl-l] Bio::Tools::GFF use of seqname Message-ID: Actually, I mean replace the 6 lines with the single line: $name = $feat->seq_id() || 'SEQ' >-----Original Message----- >From: Hilmar Lapp [mailto:hlapp@gmx.net] >Sent: Monday, February 23, 2004 12:19 PM >To: Cook, Malcolm >Cc: Bioperl; birney@sanger.ac.uk; mrp@sanger.ac.uk >Subject: Re: [Bioperl-l] Bio::Tools::GFF use of seqname > > >You mean replace can('seqname') by can('seq_id')? > >Actually, $feat->can('seq_id') must be true at all times iff >$feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to >test for >it. > > -hilmar > >On Monday, February 23, 2004, at 09:51 AM, Cook, Malcolm wrote: > >> Dear Matthew, Ewan, et al1 >> >> I see in three places in Bio::Tools::GFF the following: >> >> if( $feat->can('seqname') ) { >> $name = $feat->seq_id(); >> $name ||= 'SEQ'; >> } else { >> $name = 'SEQ'; >> } >> >> However, in Bio::SeqFeature::Generic we learn that >> >> $self->warn("-seqname is deprecated. Please use -seq_id >> instead."); >> >> So, should we rewrite those fragments in Bio::Tools:GFF as: >> $name = $feat->seq_id() || 'SEQ' >> >> ?? 
>> >> Thanks, >> >> Malcolm Cook >> Database Applications Manager, Bioinformatics >> Stowers Institute for Medical Research >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l@portal.open-bio.org >> http://portal.open-bio.org/mailman/listinfo/bioperl-l >> >> >-- >------------------------------------------------------------- >Hilmar Lapp email: lapp at gnf.org >GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 >------------------------------------------------------------- > > > From MEC at Stowers-Institute.org Mon Feb 23 14:56:36 2004 From: MEC at Stowers-Institute.org (Cook, Malcolm) Date: Mon Feb 23 15:02:43 2004 Subject: [Bioperl-l] inferring exon features in SeqFeature::Tools::Unflattener Message-ID: Dear Chris and fellow Bioperlers, I have made the following patch on my local version of this module in order to provide values for the seq_id and the /locus_tag and /gene tags of inferred exons. Please advise if this is in the spirit of the module, and whether this patch can be incorporated in the live version. Thanks! Malcolm Cook Database Applications Manager, Bioinformatics Stowers Institute for Medical Research (816) 926-4449 Index: Unflattener.pm =================================================================== RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqFeature/Tools/Unflattener.p m,v retrieving revision 1.19 diff -c -r1.19 Unflattener.pm *** Unflattener.pm 2003/12/16 22:31:16 1.19 --- Unflattener.pm 2004/02/23 19:45:32 *************** *** 2408,2413 **** --- 2408,2424 ---- -primary_tag=>'exon'); my $locstr = 'exon::'.$self->_locstr($subsf); + ## Provide seq_id to new feature: + $subsf->seq_id($sf->seq_id); + ## Transfer /locus_tag and /gene tag values to inferred + ## features. TODO: Perhaps? this should not be done + ## indiscriminantly but rather by virtue of the setting + ## of group_tag. 
+ foreach my $tag (grep /gene|locus_tag/, $sf->get_all_tags) { + my @vals = $sf->get_tag_values($tag); + $subsf->add_tag_value($tag, @vals); + } + # re-use feature if type and location the same if ($loc_h{$locstr}) { $subsf = $loc_h{$locstr}; From amackey at pcbi.upenn.edu Mon Feb 23 16:10:49 2004 From: amackey at pcbi.upenn.edu (Aaron J. Mackey) Date: Mon Feb 23 16:16:59 2004 Subject: [Bioperl-l] StandAloneBlast.pm, bl2seq() and tempfiles on Win32/cygwin Message-ID: A colleague of mine is frustrated by attempting to use Bio::Tools::Run::StandAloneBlast to run bl2seq (Perl 5.8.2, bioperl 1.4, windows XP, CygWin, etc.): # synopsis: $seq1 = $seqio->next_seq; $seq2 = $seqio->next_seq; $factory->bl2seq($seq1, $seq2); StandAloneBlast successfully writes two temp files in /tmp, which have the sequence data and can be read by "less" or "cat" in another open window (with the main program suspended in debugger); however, if either the program code or I at the command line attempt to run bl2seq, it dies with "Cannot open file /tmp/7aasd78asd". If I "cp" the temp files into new files, it runs fine. Or, if I call $factory->bl2seq($file1, $file2) with filenames instead of seq objects, it also works fine. I have tried various incarnations of closing the filehandles and Bio::SeqIO objects that StandAloneBlast.pm is using to generate these temp files, but to no avail (and of course, the tempfiles disappear upon program completion). This is not the "failed to open tempfile; too many files open" error seen previously, and I also expect a fair number of "works for me" responses - please save your breath. Thanks for any input, -Aaron From pm66 at nyu.edu Mon Feb 23 16:06:50 2004 From: pm66 at nyu.edu (Philip MacMenamin) Date: Mon Feb 23 16:17:32 2004 Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new wormbase models. 
In-Reply-To: References: Message-ID: <200402232111.i1NLBOsD003757@mx5.nyu.edu> This worked for me: my $db = new Bio::DB::GFF(-adaptor=>'dbi::mysqlopt', -dsn=>'dbi:mysql:WS118;host=localhost', -user=>$user, -pass=>$pass, # -aggregator => [qw(wormabse_cds{coding_exon,5_UTR,3_UTR/CDS})], # Not working -aggregator => [qw(wormabse_cds{coding_exon:curated})], ) or die(); my $panelSeg = $db->segment(CDS=>$CDS); if(!$panelSeg) { #do something } else { my @features = $panelSeg->features(); my @UTRs = $searchSeg->features('UTR'); my @all_transcripts = $searchSeg->features('wormabse_cds'); $all_transcripts[0]{subfeatures}{UTR} = \@UTRs; ###<<<<>>>> } Its not really a nice way to do it, but, it does the job with the new models. Thanks for the advice, Philip. On Tuesday 17 February 2004 05:10 pm, you wrote: > Hi Phillip - > > You need to aggregate the separate parts of the CDS. Create a wormbase_cds > (or whatever you wish to call it), aggregating the following features using > the CDS group: coding_exon,5_UTR,3_UTR. > > The following stanza should do the trick. > > $dbgff = (-adaptor => 'dbi::mysql', > -dsn => 'dbi:mysql:database=your_database;host=your_host', > -aggregators => [qw(wormabse_cds{coding_exon,5_UTR,3_UTR/CDS})], > -user => 'your_username', > -pass => 'your_dbgff_pass'); > > This should do the trick for properly aggregating genes under the new > WormBase CDS class. > > Todd Harris > From cjm at fruitfly.org Mon Feb 23 17:38:05 2004 From: cjm at fruitfly.org (Chris Mungall) Date: Mon Feb 23 17:44:42 2004 Subject: [Bioperl-l] Re: inferring exon features in SeqFeature::Tools::Unflattener - an improvement? In-Reply-To: Message-ID: Hi Malcolm Thanks for the patch! 
This is indeed in keeping with the spirit of the module. I have incorporated it with one minor modification, changing $subsf->seq_id($sf->seq_id); to $subsf->seq_id($sf->seq_id) if $sf->seq_id; This saves unnecessary null accessors; as far as I can tell, parsing genbank/embl will populate the seq_id accessor in the underlying location object, but this is not propagated up to the feature seq_id (either by explicitly copying the data or by transitivity/delegation). All very confusing. If you want this field populated in the newly created exon location objects you may need to add extra code to do this. Now in cvs Cheers Chris On Mon, 23 Feb 2004, Cook, Malcolm wrote: > Dear Chris and fellow Bioperlers, > > I have made the following patch on my local version of this module in > order to provide values for the seq_id and the /locus_tag and /gene tags > of inferred exons. > > Please advise if this is in the spirit of the module, and whether this > patch can be incorporated in the live version. > > Thanks! > Malcolm Cook > Database Applications Manager, Bioinformatics > Stowers Institute for Medical Research > (816) 926-4449 > > > > Index: Unflattener.pm > =================================================================== > RCS file: > /home/repository/bioperl/bioperl-live/Bio/SeqFeature/Tools/Unflattener.p > m,v > retrieving revision 1.19 > diff -c -r1.19 Unflattener.pm > *** Unflattener.pm 2003/12/16 22:31:16 1.19 > --- Unflattener.pm 2004/02/23 19:45:32 > *************** > *** 2408,2413 **** > --- 2408,2424 ---- > > -primary_tag=>'exon'); > my $locstr = 'exon::'.$self->_locstr($subsf); > > + ## Provide seq_id to new feature: > + $subsf->seq_id($sf->seq_id); > + ## Transfer /locus_tag and /gene tag values to inferred > + ## features. TODO: Perhaps? this should not be done > + ## indiscriminantly but rather by virtue of the setting > + ## of group_tag.
> + foreach my $tag (grep /gene|locus_tag/, $sf->get_all_tags) { > + my @vals = $sf->get_tag_values($tag); > + $subsf->add_tag_value($tag, @vals); > + } > + > # re-use feature if type and location the same > if ($loc_h{$locstr}) { > $subsf = $loc_h{$locstr}; > > From hlapp at gmx.net Mon Feb 23 19:51:41 2004 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon Feb 23 19:57:53 2004 Subject: [Bioperl-l] Bio::Tools::GFF use of seqname In-Reply-To: Message-ID: <9C021A6C-6663-11D8-B3BB-000A959EB4C4@gmx.net> Looks like the way to go. -hilmar On Monday, February 23, 2004, at 11:53 AM, Cook, Malcolm wrote: > Actually, I mean replace the 6 lines with the single line: > > $name = $feat->seq_id() || 'SEQ' > >> -----Original Message----- >> From: Hilmar Lapp [mailto:hlapp@gmx.net] >> Sent: Monday, February 23, 2004 12:19 PM >> To: Cook, Malcolm >> Cc: Bioperl; birney@sanger.ac.uk; mrp@sanger.ac.uk >> Subject: Re: [Bioperl-l] Bio::Tools::GFF use of seqname >> >> >> You mean replace can('seqname') by can('seq_id')? >> >> Actually, $feat->can('seq_id') must be true at all times iff >> $feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to >> test for >> it. >> >> -hilmar >> >> On Monday, February 23, 2004, at 09:51 AM, Cook, Malcolm wrote: >> >>> Dear Matthew, Ewan, et al1 >>> >>> I see in three places in Bio::Tools::GFF the following: >>> >>> if( $feat->can('seqname') ) { >>> $name = $feat->seq_id(); >>> $name ||= 'SEQ'; >>> } else { >>> $name = 'SEQ'; >>> } >>> >>> However, in Bio::SeqFeature::Generic we learn that >>> >>> $self->warn("-seqname is deprecated. Please use -seq_id >>> instead."); >>> >>> So, should we rewrite those fragments in Bio::Tools:GFF as: >>> $name = $feat->seq_id() || 'SEQ' >>> >>> ?? 
>>> >>> Thanks, >>> >>> Malcolm Cook >>> Database Applications Manager, Bioinformatics >>> Stowers Institute for Medical Research >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l@portal.open-bio.org >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >> -- >> ------------------------------------------------------------- >> Hilmar Lapp email: lapp at gnf.org >> GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 >> ------------------------------------------------------------- >> >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From m_conte at hotmail.com Tue Feb 24 04:35:48 2004 From: m_conte at hotmail.com (matthieu CONTE) Date: Tue Feb 24 04:41:55 2004 Subject: [Bioperl-l] missing Message-ID: Hello, I have a new problem to load the whole rice genome form Tigr to my biosql db I have download the parser $tigrxml.dtd....... Thanks. perl load_seqdatabase.pl --host biopipe --dbname biopipe --namespace biopipe --format tigr /home/conte/pipeline_orthologues/data/orysa_tigr.txt Loading /home/conte/pipeline_orthologues/data/orysa_tigr.txt ...
------------- EXCEPTION ------------- MSG: [19]Required missing STACK Bio::SeqIO::tigr::throw /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:1338 STACK Bio::SeqIO::tigr::_process_header /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:700 STACK Bio::SeqIO::tigr::_process_assembly /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:535 STACK Bio::SeqIO::tigr::_process_tigr /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:453 STACK Bio::SeqIO::tigr::_process /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:420 STACK Bio::SeqIO::tigr::_initialize /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:90 STACK Bio::SeqIO::new /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:358 STACK Bio::SeqIO::new /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:378 STACK toplevel load_seqdatabase.pl:436 Matthieu CONTE _________________________________________________________________ MSN Messenger : discutez en direct avec vos amis ! http://www.msn.fr/msger/default.asp From dhoworth at mrc-lmb.cam.ac.uk Tue Feb 24 04:55:33 2004 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Tue Feb 24 05:01:52 2004 Subject: use of seq_id. was: [Bioperl-l] Bio::Tools::GFF use of seqname In-Reply-To: References: Message-ID: <403B1F95.20504@mrc-lmb.cam.ac.uk> Hilmar Lapp wrote: > Actually, $feat->can('seq_id') must be true at all times iff > $feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to test for it. Where does this come from, please? In the SeqFeatureI documentation it says seq_id 'is an attribute such that you *can* store the ID' (my emphasis). You seem to be saying that if I'm creating a bunch of (sub) features just so I can use Bio::Graphics, I must attach a seq_id to each and every one. I have an inverse question that I haven't managed to find an answer to yet. 
If I'm displaying these sub-features as segments, how can I attach some text to the feature that will be displayed alongside each individual segment? Thanks, Dave From ew9 at york.ac.uk Tue Feb 24 05:30:02 2004 From: ew9 at york.ac.uk (Elizabeth Williams) Date: Tue Feb 24 05:36:11 2004 Subject: [Bioperl-l] neighbor.pm Message-ID: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk> Hello, I have a problem running the bit of code below. I get this message: "Can't call method "names" on an undefined value at /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm line 470." but not all the time - it mostly works but on some alignments it comes up with this error. Any ideas of what the problem is or how to fix it? #align sequences my @params_align = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'QUIET' => 1); my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params_align); my $seq_array_ref = \@seq_array; # where @seq_array is an array of Bio::Seq objects my $aln = $factory->align($seq_array_ref); my @params_protdist = ('MODEL' => 'PAM', 'QUIET' => 1); my $protdist_factory = Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist); $protdist_factory->version('3.6'); my $matrix = $protdist_factory->run($aln); my @params_neighbor = ('type'=>'NJ', 'QUIET' => 1); my $neighborfactory = Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor); $neighborfactory->version('3.6'); my (@trees) = $neighborfactory->run($matrix); my $outtree = new Bio::TreeIO(-file => ">>geneorigin_results2.xls"); foreach my $tree (@trees) { $outtree->write_tree($tree); } Elizabeth J.B. 
Williams CNAP Department of Biology University of York York YO10 5YW mobile: 07813149274 work: 01904 328757 Fax: 01904 328762 From jason at cgt.duhs.duke.edu Tue Feb 24 08:10:34 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Feb 24 08:16:52 2004 Subject: [Bioperl-l] neighbor.pm In-Reply-To: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk> References: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk> Message-ID: can you track it down to a specific dataset which causes the problem? I would first guess that neighbor is failing and we're not detecting that very well. you're getting an empty matrix so that is why names is failing. -jason On Tue, 24 Feb 2004, Elizabeth Williams wrote: > Hello, > I have a problem running the bit of code below. I get this message: > > "Can't call method "names" on an undefined value at > /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm > line 470." > > but not all the time - it mostly works but on some alignments it comes up > with this error. > Any ideas of what the problem is or how to fix it? 
> > > #align sequences > my @params_align = ('ktuple' => 2, 'matrix' => > 'BLOSUM', 'QUIET' => 1); > my $factory = > Bio::Tools::Run::Alignment::Clustalw->new(@params_align); > my $seq_array_ref = \@seq_array; # where > @seq_array is an array of Bio::Seq objects > my $aln = $factory->align($seq_array_ref); > my @params_protdist = ('MODEL' => 'PAM', 'QUIET' > => 1); > > my $protdist_factory = > Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist); > > $protdist_factory->version('3.6'); > > my $matrix = $protdist_factory->run($aln); > > my @params_neighbor = ('type'=>'NJ', 'QUIET' => 1); > > my $neighborfactory = > Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor); > > $neighborfactory->version('3.6'); > > my (@trees) = $neighborfactory->run($matrix); > > my $outtree = new Bio::TreeIO(-file => > ">>geneorigin_results2.xls"); > > foreach my $tree (@trees) { > > $outtree->write_tree($tree); > > } > > Elizabeth J.B. Williams > CNAP > Department of Biology > University of York > York > YO10 5YW > mobile: 07813149274 > work: 01904 328757 > Fax: 01904 328762 > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From ew9 at york.ac.uk Tue Feb 24 08:32:26 2004 From: ew9 at york.ac.uk (Elizabeth Williams) Date: Tue Feb 24 08:38:51 2004 Subject: [Bioperl-l] neighbor.pm In-Reply-To: References: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk> Message-ID: <6.0.1.1.0.20040224132932.0252f470@ew9.imap.york.ac.uk> I am pulling down the set of sequences using eval {$seq =$gb->get_Seq_by_id($id);} from a list of gi identifiers. 
The list which stopped Neighbor.pm was: 2522394 10727920 13124364 633631 10579820 25317156 15887696 17934261 17988084 15964148 13474446 15892324 15604167 16124268 19914310 23051232 20906191 37520602 35211596 33862201 33634419 33238784 39933589 27376035 17132771 23041817 1652903 23129777 22295967 33632137 7287834 33862352 22406149 14324888 could this be a problem for my script and if so how is the best way of catching the error. At 13:10 24/02/2004, you wrote: >can you track it down to a specific dataset which causes the problem? I >would first guess that neighbor is failing and we're not detecting that >very well. you're getting an empty matrix so that is why names is >failing. > >-jason > >On Tue, 24 Feb 2004, Elizabeth Williams wrote: > > > Hello, > > I have a problem running the bit of code below. I get this message: > > > > "Can't call method "names" on an undefined value at > > > /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm > > line 470." > > > > but not all the time - it mostly works but on some alignments it comes up > > with this error. > > Any ideas of what the problem is or how to fix it? 
> > > > > > #align sequences > > my @params_align = ('ktuple' => 2, 'matrix' => > > 'BLOSUM', 'QUIET' => 1); > > my $factory = > > Bio::Tools::Run::Alignment::Clustalw->new(@params_align); > > my $seq_array_ref = \@seq_array; # where > > @seq_array is an array of Bio::Seq objects > > my $aln = $factory->align($seq_array_ref); > > my @params_protdist = ('MODEL' => 'PAM', 'QUIET' > > => 1); > > > > my $protdist_factory = > > Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist); > > > > $protdist_factory->version('3.6'); > > > > my $matrix = $protdist_factory->run($aln); > > > > my @params_neighbor = ('type'=>'NJ', 'QUIET' > => 1); > > > > my $neighborfactory = > > Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor); > > > > $neighborfactory->version('3.6'); > > > > my (@trees) = $neighborfactory->run($matrix); > > > > my $outtree = new Bio::TreeIO(-file => > > ">>geneorigin_results2.xls"); > > > > foreach my $tree (@trees) { > > > > $outtree->write_tree($tree); > > > > } > > > > Elizabeth J.B. Williams > > CNAP > > Department of Biology > > University of York > > York > > YO10 5YW > > mobile: 07813149274 > > work: 01904 328757 > > Fax: 01904 328762 > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > >-- >Jason Stajich >Duke University >jason at cgt.mc.duke.edu Elizabeth J.B. Williams CNAP Department of Biology University of York York YO10 5YW mobile: 07813149274 work: 01904 328757 Fax: 01904 328762 From brian_osborne at cognia.com Tue Feb 24 10:22:13 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Tue Feb 24 10:28:20 2004 Subject: [Bioperl-l] StandAloneBlast.pm, bl2seq() and tempfiles on Win32/cygwin In-Reply-To: Message-ID: Aaron, Because he's using the BLAST Win binaries which don't understand Cygwin paths? 
Meaning these work: blastall -i test.fa -d testdb.fa -p blastn blastall -i e:/cygwin/home/bosborne/test.fa -d test.fa -p blastn But this doesn't: blastall -i /home/bosborne/test.fa -d test.fa -p blastn ? Brian O. -----Original Message----- From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Aaron J. Mackey Sent: Monday, February 23, 2004 4:11 PM To: Bioperl list Cc: Sushma Parankush Das Subject: [Bioperl-l] StandAloneBlast.pm,bl2seq() and tempfiles on Win32/cygwin A colleague of mine is frustrated by attempting to use Bio::Tools::Run::StandAloneBlast to run bl2seq (Perl 5.8.2, bioperl 1.4, windows XP, CygWin, etc.): # synopsis: $seq1 = $seqio->next_seq; $seq2 = $seqio->next_seq; $factory->bl2seq($seq1, $seq2); StandAloneBlast successfully writes two temp files in /tmp, which have the sequence data and can be read by "less" or "cat" in another open window (with the main program suspended in debugger); however, if either the program code or I at the command line attempt to run bl2seq, it dies with "Cannot open file /tmp/7aasd78asd". If I "cp" the temp files into new files, it runs fine. Or, if I call $factory->bl2seq($file1, $file2) with filenames instead of seq objects, it also works fine. I have tried various incarnations of closing the filehandles and Bio::SeqIO objects that StandAloneBlast.pm is using to generate these temp files, but to no avail (and of course, the tempfiles disappear upon program completion). This is not the "failed to open tempfile; too many files open" error seen previously, and I also expect a fair number of "works for me" responses - please save your breath. 
Thanks for any input, -Aaron _______________________________________________ Bioperl-l mailing list Bioperl-l@portal.open-bio.org http://portal.open-bio.org/mailman/listinfo/bioperl-l From sdavis2 at mail.nih.gov Tue Feb 24 10:35:37 2004 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Tue Feb 24 10:36:48 2004 Subject: [Bioperl-l] Re: [BioC] Questions about multtest In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD93028226EF@cl-exsrv1.irad.bbsrc.ac.uk> Message-ID: > OK, I using the multtest package to analyse my data, following the > instructions in multtest.pdf. > > I run: > >> t <- mt.teststat(data[,6:12], c(0,0,0,1,1,1,1), test="t") > > which calculates the t statistic for my data. The t statistic for my first > gene comes up as: > >> t[1] > [1] 40.60158 > > Presumably, this is equivalent to me running t.test: > >> t.test(data[1,9:12], data[1,6:8], var.equal=FALSE, alternative="two.sided") > > Welch Two Sample t-test > > data: data[1, 9:12] and data[1, 6:8] > t = 40.6016, df = 2, p-value = 0.0006061 > alternative hypothesis: true difference in means is not equal to 0 > 95 percent confidence interval: > 1.713804 2.120092 > sample estimates: > mean of x mean of y > -1.596190e-15 -1.916948e+00 > > So I want to know how I can get p-values for the t statistics I have just > calculated using mt.teststat. > > This is where I get confused - multtest.pdf says I should "compute raw nominal > two-sided p-values for the 3,051 test statistics using the standard Gaussian > distribution": > >> rawp0 <- 2 * (1 - pnorm(abs(t))) Keep in mind what this is doing: computing a p-value based on the normal approximation of the t-distribution with infinite degrees of freedom. In your case, this approximation does not hold (probably) because of the smaller than infinite (or, in practice 50 or so) number of degrees of freedom. The above is asking what the probability of seeing a z-score of 40, which is nearly equivalent to 0 (and, to the number of significant digits here, IS 0). 
What you probably want is: rawp0 <- 2 * (1 - pt(abs(t),df=2)) Like so: > 2*(1-pt(40.60158,df=2)) [1] 0.000606065 Which agrees with your t-test value. Then, you can soldier on. > Soldiering on, I want to calculate adjusted p-values accoridng to Benjamini > and Hochberg: > >> res <- mt.rawp2adjp(rawp0, "BH") >> adjp <- res$adjp[order(res$index), ] >> adjp[1] > [1] 0 Sean -- Sean Davis, M.D., Ph.D. Clinical Fellow National Institutes of Health National Cancer Institute National Human Genome Research Institute Clinical Fellow, Johns Hopkins Department of Pediatric Oncology -- From jaymoore at plantkind.com Tue Feb 24 11:12:45 2004 From: jaymoore at plantkind.com (Jay Moore) Date: Tue Feb 24 11:17:02 2004 Subject: [Bioperl-l] Re: Bio ::seqIO ::tigr Message-ID: <403b77fd74b002.61830072@businessserve.co.uk> Matthieu CONTE reported this: MSG: [19]Required missing (See his message below) I get the same result, and when I looked into tigr.pm, the problem is not actually with the tag, despite the error message, nor I think is the TIGR XML file at fault. The problem is happening further up the tigr.pm module, in the _process_header method, with the line. Not sure why (my regex skills are not so hot) but the method does not spot valid lines in the XML. I found that when I changed the regex in the KEYWORDS line from [^<] to ([^<]*) it would recognise the KEYWORDS tag OK, and progress on, past the as well. Don't know exactly why, I just copied code from one of the other
tags. So far so good. For me it bugs out later now - there is no _process_tiling_path method, which there should be. I reported this one via bugzilla. To get past this one I chopped the whole object out of the TIGR XML file. I now get another error later on - [79]Required Missing. Still looking at why this one happens. Matthieu CONTE's original message: I currently trying to use the Bio ::seqIO ::tigr module. My objective is to download the whole rice genome form Tigr ( adress below)and to integrate it in my BioSQL DB. For this I am trying to convert the tigr format in swiss format with the script below use Bio::SeqIO; my $in = Bio::SeqIO->new(-file =>''tigr'); my $out = Bio::SeqIO->new(-file => '>/home/conte/pipeline_orthologues/data/orysa_swiss.txt' , -format=>'swiss'); print $out $_ while <$in>; I obtain: ------------ EXCEPTION ------------- MSG: [19]Required missing STACK Bio::SeqIO::tigr::throw /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:1338 STACK Bio::SeqIO::tigr::_process_header /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:700 STACK Bio::SeqIO::tigr::_process_assembly /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:535 STACK Bio::SeqIO::tigr::_process_tigr /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:453 STACK Bio::SeqIO::tigr::_process /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:420 STACK Bio::SeqIO::tigr::_initialize /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:90 STACK Bio::SeqIO::new /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:358 STACK Bio::SeqIO::new /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:378 STACK toplevel get_bioseq_tigr.pl:8 Could you please tell me if there is a problem with the parser or with the input data format of Tigr? Thanks in advance Matthieu CONTE m_conte at hotmail.com _________________________________________________________________ MSN Messenger : discutez en direct avec vos amis ! 
http://www.msn.fr/msger/default.asp From amackey at pcbi.upenn.edu Tue Feb 24 11:19:34 2004 From: amackey at pcbi.upenn.edu (Aaron J. Mackey) Date: Tue Feb 24 11:25:44 2004 Subject: [Bioperl-l] StandAloneBlast.pm, bl2seq() and tempfiles on Win32/cygwin In-Reply-To: References: Message-ID: <3BAE2AC8-66E5-11D8-9A28-000A958C5008@pcbi.upenn.edu> It's not StandALoneBlast's fault: it's using Bio::Root::IO::tempfile() which uses some convoluted logic that I can't follow (and, if I remember right, is a copy of an older File::Temp incarnation). So, folks, how can we best inform BioPerl where we want it to make temporary files? -Aaron On Feb 24, 2004, at 11:11 AM, Brian Osborne wrote: > Aaron, > > That could work. Unfortunately that would mess up the path for those > applications that DO use Unix-style paths. > > Well, hold on, let me try.... > > No, neither worked: > > MSG: Could not open /tmp/Av0MhgqzIJ: No such file or directory > > StandAloneBlast seems not to care about either env's. That's not nice. > > > Brian O. > > -----Original Message----- > From: Aaron J. Mackey [mailto:amackey@pcbi.upenn.edu] > Sent: Tuesday, February 24, 2004 10:58 AM > To: Brian Osborne > Subject: Re: [Bioperl-l] StandAloneBlast.pm, bl2seq() and tempfiles on > Win32/cygwin > > Ahh, right; perhaps the $TEMPDIR (or $TEMP?) environment variable would > do the trick? > > -Aaron > > On Feb 24, 2004, at 10:46 AM, Brian Osborne wrote: > >> Aaron, >> >> Or ask the author for a way to set the tempdir, in my case it would be >> "e:/cygwin/tmp". I couldn't see such a thing in the documentation, >> perhaps I >> missed it though. >> >> BIO >> >> -----Original Message----- >> From: Aaron J. Mackey [mailto:amackey@pcbi.upenn.edu] >> Sent: Tuesday, February 24, 2004 10:32 AM >> To: Brian Osborne >> Subject: Re: [Bioperl-l] StandAloneBlast.pm, bl2seq() and tempfiles on >> Win32/cygwin >> >> Ooo, that's probably it; what's the solution? 
>> >> -Aaron >> >> On Feb 24, 2004, at 10:22 AM, Brian Osborne wrote: >> >>> Aaron, >>> >>> Because he's using the BLAST Win binaries which don't understand >>> Cygwin >>> paths? >>> >>> Meaning these work: >>> >>> blastall -i test.fa -d testdb.fa -p blastn >>> >>> blastall -i e:/cygwin/home/bosborne/test.fa -d test.fa -p blastn >>> >>> >>> But this doesn't: >>> >>> blastall -i /home/bosborne/test.fa -d test.fa -p blastn >>> >>> ? >>> >>> >>> Brian O. >>> >>> -----Original Message----- >>> From: bioperl-l-bounces@portal.open-bio.org >>> [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Aaron J. >>> Mackey >>> Sent: Monday, February 23, 2004 4:11 PM >>> To: Bioperl list >>> Cc: Sushma Parankush Das >>> Subject: [Bioperl-l] StandAloneBlast.pm,bl2seq() and tempfiles on >>> Win32/cygwin >>> >>> >>> A colleague of mine is frustrated by attempting to use >>> Bio::Tools::Run::StandAloneBlast to run bl2seq (Perl 5.8.2, bioperl >>> 1.4, windows XP, CygWin, etc.): >>> >>> # synopsis: >>> $seq1 = $seqio->next_seq; >>> $seq2 = $seqio->next_seq; >>> $factory->bl2seq($seq1, $seq2); >>> >>> StandAloneBlast successfully writes two temp files in /tmp, which >>> have >>> the sequence data and can be read by "less" or "cat" in another open >>> window (with the main program suspended in debugger); however, if >>> either the program code or I at the command line attempt to run >>> bl2seq, >>> it dies with "Cannot open file /tmp/7aasd78asd". If I "cp" the temp >>> files into new files, it runs fine. Or, if I call >>> $factory->bl2seq($file1, $file2) with filenames instead of seq >>> objects, >>> it also works fine. I have tried various incarnations of closing the >>> filehandles and Bio::SeqIO objects that StandAloneBlast.pm is using >>> to >>> generate these temp files, but to no avail (and of course, the >>> tempfiles disappear upon program completion). 
>>> >>> This is not the "failed to open tempfile; too many files open" error >>> seen previously, and I also expect a fair number of "works for me" >>> responses - please save your breath. >>> >>> Thanks for any input, >>> >>> -Aaron >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l@portal.open-bio.org >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l@portal.open-bio.org >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l >>> >> > From Marc.Logghe at devgen.com Tue Feb 24 11:32:55 2004 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue Feb 24 11:39:31 2004 Subject: [Bioperl-l] StandAloneBlast.pm, bl2seq() and tempfiles on Win32/cygwin Message-ID: > -----Original Message----- > From: Aaron J. Mackey [mailto:amackey@pcbi.upenn.edu] > Sent: dinsdag 24 februari 2004 17:20 > To: Brian Osborne > Cc: Bioperl list > Subject: Re: [Bioperl-l] StandAloneBlast.pm,bl2seq() and tempfiles on > Win32/cygwin Setting the environment variable TMPDIR should work. It is used by File::Spec::Unix and File::Spec::Win32 HTH, Marc From hz5 at njit.edu Tue Feb 24 11:37:13 2004 From: hz5 at njit.edu (hz5@njit.edu) Date: Tue Feb 24 11:43:17 2004 Subject: [Bioperl-l] Bioperl graphics Message-ID: <1077640633.403b7db94e285@webmail.njit.edu> Dear all, I am trying to render a CDS using bioperl, I want the arrow ruler on top display coordinates from 18058059 to 18068032 but it seems that it is too big, the image just wouldn't render any suggestions? # #$s = 18058059 and $t = 18068032 # my $whole_seq = Bio::SeqFeature::Generic->new( -start=>$s, -end=>$t, ); $panel->add_track($whole_seq, -glyph => 'arrow', -fgcolor => 'black', -bump => 0, -bgcolor => 'red', -double=>1, -tick => 2); Thanks, waiting online! 
========================================================= Haibo Zhang, PhD student Computational Biology, NJIT & Rutgers University Center for Applied Genomics, PHRI http://afs13.njit.edu/~hz5 From brian_osborne at cognia.com Tue Feb 24 11:41:27 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Tue Feb 24 11:47:37 2004 Subject: [Bioperl-l] StandAloneBlast.pm, bl2seq() and tempfiles on Win32/cygwin In-Reply-To: Message-ID: Marc, That worked! So this is the fix for Cygwin, provided some other application, driven by Perl and using those modules, expects to see Unix-style paths. I'll note in INSTALL.WIN that this is the workaround but that since other apps in Bioperl and Cygwin may suffer, one might want to set this in a script. Brian O. -----Original Message----- From: Marc Logghe [mailto:Marc.Logghe@devgen.com] Sent: Tuesday, February 24, 2004 11:33 AM To: Aaron J. Mackey; Brian Osborne Cc: Bioperl list Subject: RE: [Bioperl-l] StandAloneBlast.pm,bl2seq() and tempfiles on Win32/cygwin > -----Original Message----- > From: Aaron J. Mackey [mailto:amackey@pcbi.upenn.edu] > Sent: dinsdag 24 februari 2004 17:20 > To: Brian Osborne > Cc: Bioperl list > Subject: Re: [Bioperl-l] StandAloneBlast.pm,bl2seq() and tempfiles on > Win32/cygwin Setting the environment variable TMPDIR should work. It is used by File::Spec::Unix and File::Spec::Win32 HTH, Marc
From jason at cgt.duhs.duke.edu Tue Feb 24 14:43:27 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Feb 24 14:49:42 2004 Subject: [Bioperl-l] neighbor.pm In-Reply-To: <6.0.1.1.0.20040224145032.0255bab8@ew9.imap.york.ac.uk> References: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk> <6.0.1.1.0.20040224132932.0252f470@ew9.imap.york.ac.uk> <6.0.1.1.0.20040224145032.0255bab8@ew9.imap.york.ac.uk> Message-ID: There is a 'bad' amino acid in your data. I get this when I run phylip protdist by hand on your data: WARNING -- BAD AMINO ACID:U AT POSITION 1206 OF SPECIES 32 offending base is a 'U' [jason@sonogno jason]$ seqret -sbegin 1205 -send 1207 out.aln.fasta:AAF44872/1-3 stdout Reads and writes (returns) sequences >AAF44872/1-3 -U- So you'll need to prune these out of the data I guess. -jason On Tue, 24 Feb 2004, Elizabeth Williams wrote: > Here is the alignment. > > At 14:06 24/02/2004, you wrote: > >Any chance you can save the multiple sequence alignment and send that out > >instead? > >-jason > > > >On Tue, 24 Feb 2004, Elizabeth Williams wrote: > > > > > I am pulling down the set of sequences using eval {$seq > > > =$gb->get_Seq_by_id($id);} > > > from a list of gi identifiers.
> > > The list which stopped Neighbor.pm was: > > > 2522394 > > > 10727920 > > > 13124364 > > > 633631 > > > 10579820 > > > 25317156 > > > 15887696 > > > 17934261 > > > 17988084 > > > 15964148 > > > 13474446 > > > 15892324 > > > 15604167 > > > 16124268 > > > 19914310 > > > 23051232 > > > 20906191 > > > 37520602 > > > 35211596 > > > 33862201 > > > 33634419 > > > 33238784 > > > 39933589 > > > 27376035 > > > 17132771 > > > 23041817 > > > 1652903 > > > 23129777 > > > 22295967 > > > 33632137 > > > 7287834 > > > 33862352 > > > 22406149 > > > 14324888 > > > > > > could this be a problem for my script and if so how is the best way of > > > catching the error. > > > > > > At 13:10 24/02/2004, you wrote: > > > >can you track it down to a specific dataset which causes the problem? I > > > >would first guess that neighbor is failing and we're not detecting that > > > >very well. you're getting an empty matrix so that is why names is > > > >failing. > > > > > > > >-jason > > > > > > > >On Tue, 24 Feb 2004, Elizabeth Williams wrote: > > > > > > > > > Hello, > > > > > I have a problem running the bit of code below. I get this message: > > > > > > > > > > "Can't call method "names" on an undefined value at > > > > > > > > > > > /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm > > > > > line 470." > > > > > > > > > > but not all the time - it mostly works but on some alignments it > > comes up > > > > > with this error. > > > > > Any ideas of what the problem is or how to fix it? 
> > > > > > > > > > > > > > > #align sequences > > > > > my @params_align = ('ktuple' => 2, 'matrix' => > > > > > 'BLOSUM', 'QUIET' => 1); > > > > > my $factory = > > > > > Bio::Tools::Run::Alignment::Clustalw->new(@params_align); > > > > > my $seq_array_ref = \@seq_array; # where > > > > > @seq_array is an array of Bio::Seq objects > > > > > my $aln = $factory->align($seq_array_ref); > > > > > my @params_protdist = ('MODEL' => 'PAM', > > 'QUIET' > > > > > => 1); > > > > > > > > > > my $protdist_factory = > > > > > Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist); > > > > > > > > > > $protdist_factory->version('3.6'); > > > > > > > > > > my $matrix = $protdist_factory->run($aln); > > > > > > > > > > my @params_neighbor = ('type'=>'NJ', 'QUIET' > > > > => 1); > > > > > > > > > > my $neighborfactory = > > > > > Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor); > > > > > > > > > > $neighborfactory->version('3.6'); > > > > > > > > > > my (@trees) = $neighborfactory->run($matrix); > > > > > > > > > > my $outtree = new Bio::TreeIO(-file => > > > > > ">>geneorigin_results2.xls"); > > > > > > > > > > foreach my $tree (@trees) { > > > > > > > > > > $outtree->write_tree($tree); > > > > > > > > > > } > > > > > > > > > > Elizabeth J.B. Williams > > > > > CNAP > > > > > Department of Biology > > > > > University of York > > > > > York > > > > > YO10 5YW > > > > > mobile: 07813149274 > > > > > work: 01904 328757 > > > > > Fax: 01904 328762 > > > > > > > > > > _______________________________________________ > > > > > Bioperl-l mailing list > > > > > Bioperl-l@portal.open-bio.org > > > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > > > > >-- > > > >Jason Stajich > > > >Duke University > > > >jason at cgt.mc.duke.edu > > > > > > Elizabeth J.B. 
Williams > > > CNAP > > > Department of Biology > > > University of York > > > York > > > YO10 5YW > > > mobile: 07813149274 > > > work: 01904 328757 > > > Fax: 01904 328762 > > > > > > >-- > >Jason Stajich > >Duke University > >jason at cgt.mc.duke.edu > > Elizabeth J.B. Williams > CNAP > Department of Biology > University of York > York > YO10 5YW > mobile: 07813149274 > work: 01904 328757 > Fax: 01904 328762 > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From barry.moore at genetics.utah.edu Tue Feb 24 13:21:39 2004 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue Feb 24 14:51:37 2004 Subject: [Bioperl-l] Re: Fwd: nucleic acid melting temperature In-Reply-To: References: Message-ID: <403B9633.6070902@genetics.utah.edu> Nicolas, There is a module (primer.pm) that will allow you to generate a primer object. This object has a Tm method to return the melting temperature of that primer. About a week ago that method was updated to use the nearest-neighbor thermodynamic approach to calculating Tm, and there has been a discussion going on since then about that. Your program exceeds the capabilities of that method in a variety of ways. The current method calculates the enthalpy and entropy for all dinucleotide pairs, and adjusts those for duplex initiation. It calculates Tm based on those values, the oligo concentration and salt concentration as per Allawi et. al Biochemistry 1997 36:10581-10594 (however the salt adjustment was taken from http://biotools.idtdna.com/analyzer/). The primer.pm module containing that code can be found at: http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/SeqFeature/Primer.pm?cvsroot=bioperl. I believe that Rob Edwards is the current maintainer of that module. What the current method does not do that your program does is account for the possibility of mismatches and dangling ends. I think the current primer object would need some redesigning to allow for those. 
You may also be using a more accurate adjustment for salt concentration. Your Melting program looks like it would be a great addition to bioperl. I'm fairly new to bioperl, and don't know the overall object structure well enough yet to comment from a developer's point of view, but I wonder if your algorithm would be better placed somewhere with a broader scope than as a method of the SeqFeature::Primer object, perhaps as a method available to all sequence objects. I believe Rob made a similar comment in his original documentation of the Tm method. Perhaps some of the seasoned Bioperl developers can discuss where a module with the capabilities of Melting should live. Also, as a new user, I would suggest that porting Melting to perl and integrating it into Bioperl is preferable to simply writing a wrapper (from the user's point of view, not the developer's of course). To casual and new users of Bioperl, long lists of dependencies can be very daunting. Barry Moore Nicolas Le Novere wrote: >On Tue, 24 Feb 2004, Jason Stajich wrote: > > > >>Would absolutely love your help/contribution here if you've got some >>working code, I've CC-ed the interested parties. >> >> > >I do not have BioPerl working code regarding that issue. However I >have a C program that is much better and much more plastic than >EMBOSS DAN, or the GCG equivalent (well, last time I looked at those >programs. Maybe they evolved). > >http://www.ebi.ac.uk/~lenov/meltinghome.html > >What is the current state of melting temp calculation in BioPerl? Do >you already have a module? > >If not, we could envision to write a wrapper to MELTING, like in WWW >interface of the program, or in OligoDB, SOL or SEPON, that use >MELTING for the elementary Tm. > >Or one can rewrite MELTING as a BioPerl module and then take advantage >of the reimplementation to improve the program, for instance to add >correction for Mg2+ ions. > >Just tell me how I can help. > > > -- Barry Moore Dept.
of Human Genetics University of Utah Salt Lake City, UT From pst at ksu.edu Mon Feb 23 14:21:16 2004 From: pst at ksu.edu (Paul St. Amand) Date: Tue Feb 24 14:52:20 2004 Subject: [Bioperl-l] Help with reversing a sequence Message-ID: <73BFE030-6635-11D8-B2C9-0003938893E4@ksu.edu> The perl reverse cmd is what you want that does exactly what you want -- reverse a string. What is slow specifically? Did you benchmark something? -jason No, I did not benchmark it and speed is really secondary to the question of "what is the best way to do this using BioPerl". I was just wondering if BioPerl had its own function or method for reversing like it has for revcom. So, the reverse that I am using below is the best way? > print $outputfh ">MyReversedSeq29856-29862\n",scalar > reverse($seqobj->subseq(29856,29862)),"\n"; Thanks! Paul PS, I am not subscribed to BioPerl-l, so I could not post a reply on-line. On Mon, 23 Feb 2004, Paul St. Amand wrote: > Hi, > > I am using the following script to get a subsequence and reverse it. > Note that I do NOT want the "reverse complement" of the sequence here, > just the actual reverse. BioPerl has a method to get the revcom of a > seq, such as: > > print $outputfh "Reverse complemented sequence 5 to 10 is > ",$seqobj->trunc(5,10)->revcom->seq, " \n"; > > Does BioPerl have a similar/better way to get the reverse (not revcom) > of a sequence? > > This is how I am doing it and it is slow. Is there a way that is faster > or "better" using BioPerl??? 
> > > use strict; > use warnings; > use Bio::SeqIO; > my $outputfh = *STDOUT; > > my ($infile, $in, $out, $seqobj); > $infile = shift or die; > > $in = Bio::SeqIO->new('-file' => $infile , > '-format' => 'Fasta'); > $seqobj = $in->next_seq(); > > $out = Bio::SeqIO->newFh('-format' => 'fasta', > '-noclose' => 1, > '-fh' => $outputfh); > > print $outputfh ">MyReversedSeq29856-29862\n",scalar > reverse($seqobj->subseq(29856,29862)),"\n"; > > > > > > Thanks, > Paul > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/enriched Size: 3071 bytes Desc: not available Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20040223/68cfa05b/attachment.bin From xiang.deng at duke.edu Mon Feb 23 10:37:50 2004 From: xiang.deng at duke.edu (Xiang Deng) Date: Tue Feb 24 14:53:15 2004 Subject: [Bioperl-l] Command-line Psiblast using NCBI blastpgp Message-ID: Hi Everyboday, I got a question about how to do psiblast using NCBI blastpgp. The thing I want to do is to use a PSSM generated from a multiple alignment of our internal data to blast against NCBI nr database. I followed the instruction from blast tutorial as follows, blastpgp -i seq1.txt -B align.msf -e 5000 -F F -j 2 -v 10 -d nr -o test_out.txt -C pssm.txt I do not know why I have to specify a single sequence in seq1.txt from the aligned sequences in align.msf. I want to use the pssm created from the multiple alignment in align.msf to blast instead of only one sequence. And the result looks like using the single sequence only for blast and I could not see any sign of using the PSSP calculated from the multiple alignment. I am concerned about that result, does anyone have the same experience and know what is going on there? 
Did the command line above do exactly what I want, and am I just too suspicious? Does anyone have a better way to do this kind of psiblast via the command line? Thanks a lot, Xiang Department of Pharmacology and Cancer Biology Duke University Medical Center Durham, NC 27710 From amackey at pcbi.upenn.edu Tue Feb 24 15:01:23 2004 From: amackey at pcbi.upenn.edu (Aaron J. Mackey) Date: Tue Feb 24 15:07:32 2004 Subject: [Bioperl-l] Command-line Psiblast using NCBI blastpgp In-Reply-To: References: Message-ID: <38DAACA4-6704-11D8-9A28-000A958C5008@pcbi.upenn.edu> You must include at least one sequence from the MSA as a query; this sequence defines the "columns" of the PSSM (i.e. any columns in the MSA that include gaps in this sequence will not be a part of the final PSSM). blastpgp reads the MSA and builds a PSSM after determining the relative uniqueness of each sequence in the profile, and weighting the contribution of each sequence to the PSSM by its uniqueness (imagine the extreme: an MSA that consisted of the same protein repeated 10 times; searching with this MSA would be no different than searching with the single protein). How many sequences are in your MSA? If fewer than 10, you won't see very much change between using the PSSM and just the query sequence alone. If you have 50, but they're all practically the same (redundant) sequence, you'll also see little change in the results. To sum up: don't be so suspicious, I expect it's working as well as it can, given your input sequences. -Aaron On Feb 23, 2004, at 10:37 AM, Xiang Deng wrote: > Hi Everyboday, > > I got a question about how to do psiblast using NCBI blastpgp. The > thing I > want to do is to use a PSSM generated from a multiple alignment of our > internal data to blast against NCBI nr database.
I followed the > instruction from blast tutorial as follows, > > blastpgp -i seq1.txt -B align.msf -e 5000 -F F -j 2 -v 10 -d nr -o > test_out.txt -C pssm.txt > > I do not know why I have to specify a single sequence in seq1.txt from > the > aligned sequences in align.msf. I want to use the pssm created from the > multiple alignment in align.msf to blast instead of only one sequence. > And > the result looks like using the single sequence only for blast and I > could > not see any sign of using the PSSP calculated from the multiple > alignment. > I am concerned about that result, does anyone have the same experience > and > know what is going on there? whether or not the command-line above did > exactly what I want and Iam just too suspicious? > > And anyone has a better way to do this kind of psiblast via > command-line? > > thanks a lot, > > Xiang > > Department of Pharmacology and Cancer Biology > Duke University Medical Center > Durham, NC 27710 > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From laurichj at bioinfo.ucr.edu Tue Feb 24 17:47:30 2004 From: laurichj at bioinfo.ucr.edu (Josh Lauricha) Date: Tue Feb 24 19:24:09 2004 Subject: [Bioperl-l] Re: Bio ::seqIO ::tig In-Reply-To: <403b77fd74b002.61830072@businessserve.co.uk> References: <403b77fd74b002.61830072@businessserve.co.uk> Message-ID: <20040224224730.GB18113@batch107a> On Tue 02/24/04 16:12, Jay Moore wrote: I've commited the fixes for the 1.4 version of tigr.pm to CVS, this should fix both the tiling path and keyword problems. Testing would help, as I don't use the files that cause this problem (or at least, I don't run into it). However, I'm puting the final touches on a XML::SAX based parser, which I hope to have out fairly soon. -- ------------------------------------------------------ | Josh Lauricha | Ford, your turning into | | laurichj@bioinfo.ucr.edu | a penguin. Stop it. 
| | Bioinformatics, UCR | | |----------------------------------------------------| | OpenPG: | | 5A0D 92D3 D093 79DE F724 1137 6DF1 B5EB D9CE AAA8 | |----------------------------------------------------| -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: Digital signature Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20040224/cff1ae7c/attachment.bin From redwards at utmem.edu Tue Feb 24 19:27:40 2004 From: redwards at utmem.edu (Rob Edwards) Date: Tue Feb 24 19:33:51 2004 Subject: [Bioperl-l] Re: Fwd: nucleic acid melting temperature In-Reply-To: <403B9633.6070902@genetics.utah.edu> References: <403B9633.6070902@genetics.utah.edu> Message-ID: <6BCD6BCC-6729-11D8-8A23-000A959E1622@utmem.edu> This is a pretty good summary of the situation. I initially wrote Bio::SeqFeature::Primer to hold the Primer object. At the time I was mainly using it with Bio::Tools::Run::Primer3 to design some primers for PCR amplification. The Tm calculation routine was included because it makes sense for a primer module (*). However, there are many people on the list that know a lot more about Tm calculations than I do, and I have updated the module when others propose better calculations. I do think that Tm calculations should be in a separate module (probably either their own or Bio::Tools::SeqStats) as Tm calculations could be appropriate in a variety of different experiments, but I am happy to cede that this may not be desirable because of the large number of modules! There actually is already a bioperl wrapper for running Melting .... Bio::Tools::Run::PiseApplication::melting (note that you'll need to install Bio::Tools::Run separately) that works via the Pise website. We could duplicate the effort and rewrite Melting in Perl, we could write a separate Wrapper for Bio::Tools::Run, or we could direct people to the Pise implementation. 
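For readers following the Tm discussion in this thread, the two-state relation that nearest-neighbor calculations generally rest on can be sketched in a few lines of Perl. This is only a minimal sketch under stated assumptions, not the code in Bio::SeqFeature::Primer or in Melting: it assumes a non-self-complementary duplex with both strands at equal concentration, takes the already-summed enthalpy and entropy as inputs instead of looking them up per dinucleotide, and leaves out any salt correction. The function name two_state_tm and the example numbers are made up for illustration.

```perl
use strict;
use warnings;

use constant R_GAS         => 1.987;    # gas constant, cal/(K*mol)
use constant KELVIN_OFFSET => 273.15;   # 0 deg C in kelvin

# Two-state melting temperature in deg C (hypothetical helper).
#   $dH  summed enthalpy, cal/mol (negative for duplex formation)
#   $dS  summed entropy, cal/(K*mol)
#   $ct  total strand concentration, mol/L
# Assumes a non-self-complementary duplex; no salt correction is applied.
sub two_state_tm {
    my ($dH, $dS, $ct) = @_;
    die "strand concentration must be positive\n" unless $ct > 0;
    return $dH / ($dS + R_GAS * log($ct / 4)) - KELVIN_OFFSET;
}

# Made-up totals, for illustration only.
my $tm = two_state_tm(-150_000, -400, 1e-6);
printf "Tm = %.1f C\n", $tm;
```

Keeping the thermodynamic sums as plain inputs is one way such a routine could live outside the primer object, as suggested above, since any caller that can supply the two totals and a concentration could use it.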
Rob (*) The Bio::SeqFeature::Primer and other modules were re-written by me based on the work of Chad Matsalla for which I am grateful. I expect that Chad had a Tm calculator too (possibly the same one), though I can't find an old copy of his modules to check this. On Feb 24, 2004, at 12:21 PM, Barry Moore wrote: > Nicolas, > > There is a module (primer.pm) that will allow you to generate a primer > object. This object has a Tm method to return the melting temperature > of that primer. About a week ago that method was updated to use the > nearest-neighbor thermodynamic approach to calculating Tm, and there > has been a discussion going on since then about that. Your program > exceeds the capabilities of that method in a variety of ways. The > current method calculates the enthalpy and entropy for all > dinucleotide pairs, and adjusts those for duplex initiation. It > calculates Tm based on those values, the oligo concentration and salt > concentration as per Allawi et. al Biochemistry 1997 36:10581-10594 > (however the salt adjustment was taken from > http://biotools.idtdna.com/analyzer/). The primer.pm module > containing that code can be found at: > http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/ > SeqFeature/Primer.pm?cvsroot=bioperl. I believe that Rob Edwards is > the current maintainer of that module. What the current method does > not do that your program does is account for the possibility of > mismatches and dangling ends. I think the current primer object would > need some redesigning to allow for those. You may also be using a > more accurate adjustments for salt concentration. > > Your Melting program looks like it would be a great addition to > bioperl. 
I'm farily new to bioperl, and don't know the overall object > structure well enough yet to comment from a developers point of view, > but I wonder if your algorithm would be better placed somewhere with a > boarder scope than as a method of the SeqFeature::Primer object, > perhaps as a method available to all sequence objects. I beleive Rob > made a similar comment in his original documentation of the Tm method. > Perhaps some of the seasoned Bioperl developers can discuss where a > module with the capabilities of Melting should live. Also as a new > user, I would suggest that porting Melting to perl and integrating it > into Bioperl is preferable to simply writing a wrapper (from the users > point of view, not the developers of course). To casual and new users > of Bioperl, long lists of dependencies can be very daunting. > > Barry Moore > From hlapp at gnf.org Tue Feb 24 22:28:48 2004 From: hlapp at gnf.org (Hilmar Lapp) Date: Tue Feb 24 22:34:56 2004 Subject: use of seq_id. was: [Bioperl-l] Bio::Tools::GFF use of seqname In-Reply-To: <403B1F95.20504@mrc-lmb.cam.ac.uk> Message-ID: On Tuesday, February 24, 2004, at 01:55 AM, Dave Howorth wrote: > Hilmar Lapp wrote: >> Actually, $feat->can('seq_id') must be true at all times iff >> $feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to test >> for it. > > Where does this come from, please? In the SeqFeatureI documentation > it says seq_id 'is an attribute such that you *can* store the ID' (my > emphasis). You seem to be saying that if I'm creating a bunch of > (sub) features just so I can use Bio::Graphics, I must attach a seq_id > to each and every one. I'm not saying anything about the value of the attribute. $feat->can('seq_id') will be true if you can call $feat->seq_id(), which you will always be able to since it's defined in Bio::SeqFeatureI. Whether that method returns garbage or something useful is another story. 
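The distinction drawn here, between a method being callable and its attribute holding a useful value, can be shown with plain Perl. The sketch below is only an illustration: My::FeatureI and My::Feature are hypothetical stand-ins, not the real Bio::SeqFeatureI classes, and nothing in it says what Bio::Graphics needs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an interface class that declares seq_id().
package My::FeatureI;
sub seq_id {
    my ($self, $value) = @_;
    $self->{seq_id} = $value if defined $value;
    return $self->{seq_id};
}

# Hypothetical implementation that simply inherits the method.
package My::Feature;
our @ISA = ('My::FeatureI');
sub new { return bless {}, shift }

package main;

my $feat = My::Feature->new;

# can() is true as soon as the method exists in the inheritance chain.
my $has_method = $feat->can('seq_id') ? 1 : 0;

# The attribute itself has not been set yet.
my $has_value = defined($feat->seq_id) ? 1 : 0;

print "can seq_id: $has_method, value set: $has_value\n";
# prints "can seq_id: 1, value set: 0"
```

So a can() check on an object that already passed isa() adds nothing; testing defined() on the returned value is what tells the two cases apart.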
I'm not sure whether you have to set seq_id() to something meaningful in order to remain compatible with Bio::Graphics, but I'd guess you do. Lincoln? > > I have an inverse question that I haven't managed to find an answer to > yet. If I'm displaying these sub-features as segments, how can I > attach some text to the feature that will be displayed alongside each > individual segment? > This one is for Lincoln ... -hilmar > Thanks, Dave > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From sdavis2 at mail.nih.gov Wed Feb 25 07:22:35 2004 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed Feb 25 07:28:40 2004 Subject: [Bioperl-l] Gbrowse DAS support Message-ID: I have the following situation (and I imagine that I am not unique, here): I have a number of different oligo and cDNA microarray platforms, each with probes that have (human) sequence associated with them. For many of them, we construct our own annotation by blasting against various databases, the result being blast results for each probe to genbank sequences, ests, refseq genes, ensembl genes, or the reference genome. While this level of knowledge is adequate for most applications, we are now finding (particularly with oligo probes) that we need to know with a fair amount of detail what these sequences look like in genomic context down to the basepair level. Therefore, I would like to build a browser that incorporates my local information including blast hits of my sequence against various reference sequences. While I could certainly build a local database to hold all of the possible references and their assembly, I would like to use Bio::Das to fetch annotations. 
I could see fetching most annotations from the DAS server, but including tracks for my local data, also. And, while I could build a browser, I would like to start with something done already, and Gbrowse seems a likely candidate for me. I currently have bioperl-1.4, bio::das, and gbrowse running. The installation file for gbrowse says that a feature wish list includes "better DAS support" and the ability to "configure data sources on a track-by-track" basis. Has anyone accomplished this? Are there other options at which I should look (including "go do it yourself")? Sean -- Sean Davis, M.D., Ph.D. Postdoctoral Research Fellow NHGRI, NIH Clinical Fellow NCI, NIH Clinical Fellow, Johns Hopkins Department of Pediatric Oncology -- From dhoworth at mrc-lmb.cam.ac.uk Wed Feb 25 09:05:58 2004 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Wed Feb 25 09:12:02 2004 Subject: [Bioperl-l] CPAN install Message-ID: <403CABC6.4090007@mrc-lmb.cam.ac.uk> I'm trying to upgrade my bioperl installation to 1.4. I want to do it from CPAN, using perl -MCPAN because that's the way I install Perl software. I'm having trouble, so now I have an extra reason: to see if the install process needs fixing :) I have a couple of questions: (1) On it says "Information on how to use CPAN.pm to automatically download BioPerl and various CPAN module dependencies is described on our INSTALL file." but the install file does not contain this information as far as I can see? (it describes how to install Bundle::Bioperl that way, but not Bioperl itself) (2) What is the name of the distribution? The link takes me to bioperl-1.4 but I get an error when I try to install it: perl -MCPAN -e shell cpan> install bioperl-1.4 Warning: Cannot install bioperl-1.4, don't know what it is. Try the command i /bioperl-1.4/ to find objects with matching identifiers. The suggested search returns a long list of individual modules. I get similar errors if I try 'bioperl' or 'Bioperl'. 
(3) There is a module called Bioperl, which says it is part of the bioperl-1.4 distribution, but describes itself as "Bioperl 1.3 - Perl Modules for Biology" Any ideas? Thanks, Dave -- Dave Howorth MRC Centre for Protein Engineering Hills Road, Cambridge, CB2 2QH 01223 252960 From dhoworth at mrc-lmb.cam.ac.uk Wed Feb 25 09:32:50 2004 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Wed Feb 25 09:38:53 2004 Subject: [Bioperl-l] Re: CPAN install In-Reply-To: <403CABC6.4090007@mrc-lmb.cam.ac.uk> References: <403CABC6.4090007@mrc-lmb.cam.ac.uk> Message-ID: <403CB212.4000601@mrc-lmb.cam.ac.uk> I wrote: > I'm trying to upgrade my bioperl installation to 1.4. > > (2) What is the name of the distribution? I got a little further. It is necessary to type: install B/BI/BIRNEY/bioperl-1.4.tar.gz It might be worth documenting this in the install instructions or fixing the distribution so it's not necessary. But now I get test failures: Failed Test Stat Wstat Total Fail Failed List of Failed ------------------------------------------------------------------------------- t/DB.t 78 8 10.26% 55 79-85 t/Perl.t 14 1 7.14% 13 t/RestrictionIO.t 14 1 7.14% 10 121 subtests skipped. Failed 3/179 test scripts, 98.32% okay. -4/8268 subtests failed, 100.05% okay. make: *** [test_dynamic] Error 11 /usr/bin/make test -- NOT OK Running make install make test had returned bad status, won't install without force My system is Debian Woody with a bunch of backports. Perl 5.6.1. Any thoughts on why these tests are failing? Thanks, Dave -- Dave Howorth MRC Centre for Protein Engineering Hills Road, Cambridge, CB2 2QH 01223 252960 From brian_osborne at cognia.com Wed Feb 25 09:35:11 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Wed Feb 25 09:41:24 2004 Subject: [Bioperl-l] CPAN install In-Reply-To: <403CABC6.4090007@mrc-lmb.cam.ac.uk> Message-ID: Dave, (1) You're right, there should be instructions in the INSTALL file, I'll fix that. 
(2) That long list of modules from "i/bioperl-1.4/" is the list of modules in 1.4. If you take a look at the top of that list you'll see "Distribution id = B/BI/BIRNEY/bioperl-1.4.tar.gz", that's the name you should use, i.e. "install B/BI/BIRNEY/bioperl-1.4.tar.gz". However, you'll want to install the Bundle first, "d/BioPerl/" should give you the exact name. Brian O. -----Original Message----- From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Dave Howorth Sent: Wednesday, February 25, 2004 9:06 AM To: bioperl-l@bioperl.org Subject: [Bioperl-l] CPAN install I'm trying to upgrade my bioperl installation to 1.4. I want to do it from CPAN, using perl -MCPAN because that's the way I install Perl software. I'm having trouble, so now I have an extra reason: to see if the install process needs fixing :) I have a couple of questions: (1) On it says "Information on how to use CPAN.pm to automatically download BioPerl and various CPAN module dependencies is described on our INSTALL file." but the install file does not contain this information as far as I can see? (it describes how to install Bundle::Bioperl that way, but not Bioperl itself) (2) What is the name of the distribution? The link takes me to bioperl-1.4 but I get an error when I try to install it: perl -MCPAN -e shell cpan> install bioperl-1.4 Warning: Cannot install bioperl-1.4, don't know what it is. Try the command i /bioperl-1.4/ to find objects with matching identifiers. The suggested search returns a long list of individual modules. I get similar errors if I try 'bioperl' or 'Bioperl'. (3) There is a module called Bioperl, which says it is part of the bioperl-1.4 distribution, but describes itself as "Bioperl 1.3 - Perl Modules for Biology" Any ideas? 
Thanks, Dave -- Dave Howorth MRC Centre for Protein Engineering Hills Road, Cambridge, CB2 2QH 01223 252960 _______________________________________________ Bioperl-l mailing list Bioperl-l@portal.open-bio.org http://portal.open-bio.org/mailman/listinfo/bioperl-l From dhoworth at mrc-lmb.cam.ac.uk Wed Feb 25 09:45:13 2004 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Wed Feb 25 09:51:16 2004 Subject: [Bioperl-l] CPAN install In-Reply-To: References: Message-ID: <403CB4F9.9050304@mrc-lmb.cam.ac.uk> Brian Osborne wrote: > "install B/BI/BIRNEY/bioperl-1.4.tar.gz". Thanks. > However, you'll want to install the Bundle first, "d/BioPerl/" should give > you the exact name. I'd already upgraded that. 'install Bundle::BioPerl' works exactly as I would hope. And before that, upgrading libgd from source worked exactly as it should too :) Thanks, Dave -- Dave Howorth MRC Centre for Protein Engineering Hills Road, Cambridge, CB2 2QH 01223 252960 From hz5 at njit.edu Wed Feb 25 09:49:54 2004 From: hz5 at njit.edu (hz5@njit.edu) Date: Wed Feb 25 09:55:58 2004 Subject: [Bioperl-l] bioperl graphics Message-ID: <1077720594.403cb61235be7@webmail.njit.edu> Dear all, Is there any way to render 2 Bio::Graphics::Panel into one png image? because I want 2 different arrows with different labeled coordinates on the same image and align to the left, but one Panel can only have one coordinates system. Thanks! ========================================================= Haibo Zhang, PhD student Computational Biology, NJIT & Rutgers University Center for Applied Genomics, PHRI http://afs13.njit.edu/~hz5 From brian_osborne at cognia.com Wed Feb 25 09:57:06 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Wed Feb 25 10:03:11 2004 Subject: [Bioperl-l] CPAN install In-Reply-To: <403CB4F9.9050304@mrc-lmb.cam.ac.uk> Message-ID: Dave, I forgot to mention that you may still get errors in your "make test", despite the fact that you installed the Bundle. 
The question, perhaps, is whether these failures have anything to do with your intended use of Bioperl. I've never done an install of Bioperl that passed all of the tests myself. Most of us who like the CPAN approach just do the "force install" at that point. Brian O. -----Original Message----- From: Dave Howorth [mailto:dhoworth@mrc-lmb.cam.ac.uk] Sent: Wednesday, February 25, 2004 9:45 AM To: Brian Osborne Cc: bioperl-l@bioperl.org Subject: Re: [Bioperl-l] CPAN install Brian Osborne wrote: > "install B/BI/BIRNEY/bioperl-1.4.tar.gz". Thanks. > However, you'll want to install the Bundle first, "d/BioPerl/" should give > you the exact name. I'd already upgraded that. 'install Bundle::BioPerl' works exactly as I would hope. And before that, upgrading libgd from source worked exactly as it should too :) Thanks, Dave -- Dave Howorth MRC Centre for Protein Engineering Hills Road, Cambridge, CB2 2QH 01223 252960 From brian_osborne at cognia.com Wed Feb 25 10:42:47 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Wed Feb 25 10:48:53 2004 Subject: [Bioperl-l] Re: CPAN install In-Reply-To: <403CB212.4000601@mrc-lmb.cam.ac.uk> Message-ID: Dave, No idea, though when I do "perl t/DB.t" I also get failures. You can always do the "force install" and thoroughly investigate after installation. The Perl.t failure is odd, I'm guessing it's the RefSeq retrieval, but I'm not sure. I don't think it has to do with your local installation. Same thing for the RestrictionIO.t failure. >It might be worth documenting this in the install instructions or fixing >the distribution so it's not necessary. Yes. Brian O. -----Original Message----- From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Dave Howorth Sent: Wednesday, February 25, 2004 9:33 AM To: bioperl-l@bioperl.org Subject: [Bioperl-l] Re: CPAN install I wrote: > I'm trying to upgrade my bioperl installation to 1.4. > > (2) What is the name of the distribution?
I got a little further. It is necessary to type: install B/BI/BIRNEY/bioperl-1.4.tar.gz It might be worth documenting this in the install instructions or fixing the distribution so it's not necessary. But now I get test failures: Failed Test Stat Wstat Total Fail Failed List of Failed ---------------------------------------------------------------------------- --- t/DB.t 78 8 10.26% 55 79-85 t/Perl.t 14 1 7.14% 13 t/RestrictionIO.t 14 1 7.14% 10 121 subtests skipped. Failed 3/179 test scripts, 98.32% okay. -4/8268 subtests failed, 100.05% okay. make: *** [test_dynamic] Error 11 /usr/bin/make test -- NOT OK Running make install make test had returned bad status, won't install without force My system is Debian Woody with a bunch of backports. Perl 5.6.1. Any thoughts on why these tests are failing? Thanks, Dave -- Dave Howorth MRC Centre for Protein Engineering Hills Road, Cambridge, CB2 2QH 01223 252960 _______________________________________________ Bioperl-l mailing list Bioperl-l@portal.open-bio.org http://portal.open-bio.org/mailman/listinfo/bioperl-l From crabtree at tigr.org Wed Feb 25 11:46:29 2004 From: crabtree at tigr.org (Jonathan Crabtree) Date: Wed Feb 25 11:54:01 2004 Subject: [Bioperl-l] bioperl graphics In-Reply-To: <1077720594.403cb61235be7@webmail.njit.edu> References: <1077720594.403cb61235be7@webmail.njit.edu> Message-ID: <403CD165.4090309@tigr.org> Haibo- hz5@njit.edu wrote: >Is there any way to render 2 Bio::Graphics::Panel into one png image? because I >want 2 different arrows with different labeled coordinates on the same image >and align to the left, but one Panel can only have one coordinates system. > > The answer is yes, with a couple of caveats. The first is that you will have to take care of the layout of the individual Panel-generated images. If you're left-justifying everything then this should be easy enough. 
The second is that I would recommend making a one-line change to Bio/Graphics/Panel.pm, to prevent the package from trying to allocate the same set of colors twice (when you reuse the same GD object to draw the two different parts of the image.) Search for the following piece of code in Panel.pm (at line 411 in bioperl-1.4, I think): for my $name ('white','black',keys %COLORS) { my $idx = $gd->colorAllocate(@{$COLORS{$name}}); $translation_table{$name} = $idx; } Change "colorAllocate" to "colorResolve"; this should have no effect on any existing Bio::Graphics code (AFAIK) and will allow you to do your two (or three or four)-Panel trick. (As an aside, I'd like to lobby for this one-line change to be made in a future version of Bio::Graphics::Panel, for precisely this reason.) In any case, once you've made that change and reinstalled your copy of Bioperl, here is a rough outline of what you need to do: 1. Set up your individual Bio::Graphics::Panel objects (e.g. $p1, $p2, $p3, etc.) as desired to draw your images, but do *not* call the gd method on any of them yet. 2. Create a GD::Image object big enough to hold the images that will be drawn by $p1, $p2, $p3, etc.: my $gdImg = GD::Image->new($fullWidth, $fullHeight); (Note: use $p1->width(), $p1->height(), etc., to determine what $fullWidth and $fullHeight should be, based on your desired Panel layout algorithm.) 4. Use a "dummy" Bio::Graphics::Panel object to allocate all your colors (this is an optional step; I do this because my code does some drawing that isn't handled by Bio::Graphics::Panel, and want to make sure that the palette has been allocated before I start): my $dummyPanel = Bio::Graphics::Panel->new(-length => 100, -offset => 0, -width => $fullWidth); $dummyPanel->gd($gdImg); # forces color allocation 5. 
Draw the individual panels and generate your png image: $p1->gd($gdImg); $p2->gd($gdImg); my $pngData = $gdImg->png(); I've glossed over some of the details here, for example the fact that you may need to know the value of $p1->height() before you can initialize $p2, but that's the basic idea. I've been using this method to generate some comparative sequence displays and while it's definitely a bit of a hack, it works well in practice. You can also do the same thing with a GD::SVG::Image if you'd like to generate SVG output. Good luck, Jonathan From cjfields at uiuc.edu Wed Feb 25 12:26:45 2004 From: cjfields at uiuc.edu (Chris Fields) Date: Wed Feb 25 18:05:57 2004 Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows Message-ID: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu> An HTML attachment was scrubbed... URL: http://portal.open-bio.org/pipermail/bioperl-l/attachments/20040225/aa08846c/attachment.htm
From brian_osborne at cognia.com Thu Feb 26 08:22:29 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Thu Feb 26 08:28:35 2004 Subject: [Bioperl-l] DBFetch problem Message-ID: >Dear Brian, >I found out that the RNA entries are missing from our RefSeq at the moment, >that is why the example entry is not found with dbfetch: >LOCUS NM_006732 3775 bp mRNA linear PRI 20-DEC-2003 >This will be fixed today, so by tomorrow you should find the same >entries as from NCBI website. NCBI has changed their RefSeq >distribution files and most of the RNA entries were accidentally >left outside the distribution. From sdavis2 at mail.nih.gov Thu Feb 26 16:01:53 2004 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu Feb 26 16:08:02 2004 Subject: [Bioperl-l] Mysql database connection from Gbrowse Message-ID: Lincoln and others, With regard to my last e-mail, I simply used bp_load_gff.pl and was able to load the human gff files from gmod.org within an hour or so. In any case, I have a working mysql database called human that I can access via mysql command line. I also have a working version of gbrowse that connects to the yeast_chr1 memory database. (I also connected to the DAS server at ucsc--WAY COOL.) I am now trying to connect to my mysql human database and get the following returned to the browser. http://localhost/cgi-bin/gbrowse/human ----------------- An internal error has occurred Could not open database. install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC contains: /System/Library/Perl/5.8.1/darwin-thread-multi-2level /System/Library/Perl/5.8.1 /Library/Perl/5.8.1/darwin-thread-multi-2level /Library/Perl/5.8.1 /Library/Perl /Network/Library/Perl/5.8.1/darwin-thread-multi-2level /Network/Library/Perl/5.8.1 /Network/Library/Perl .)
at (eval 18) line 3. Perhaps the DBD::mysql perl module hasn't been fully installed, or perhaps the capitalisation of 'mysql' isn't right. Available drivers: ExampleP, Multiplex, Proxy, Sponge. at /Library/Perl/5.8.1/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm line 139 Please contact this site's maintainer ([no address given]) for assistance. For the source code for this browser, see the Generic Model Organism Database Project. For other questions, send mail to lstein@cshl.org. $Id: yeast_chr1.conf,v 1.6 2004/02/04 15:03:43 marclogghe Exp $ Note: This page uses cookie to save and restore preference information. No information is shared. Generic genome browser version Bio::Graphics::Browser=HASH(0x801294) ------------------ However, this works: perl -e "use Bio::DB::GFF::Adaptor::dbi::mysql" As does perl -e "use DBD::mysql" From the human.conf file: [GENERAL] description = human db_adaptor = Bio::DB::GFF db_args = -adaptor dbi::mysql -dsn human Any ideas? Thanks, Sean
From lstein at cshl.edu Fri Feb 27 09:38:31 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Fri Feb 27 09:45:00 2004 Subject: [Bioperl-l] bioperl graphics In-Reply-To: <403CD165.4090309@tigr.org> References: <1077720594.403cb61235be7@webmail.njit.edu> <403CD165.4090309@tigr.org> Message-ID: <200402271638.37134.lstein@cshl.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sorry, but if you change colorAllocate() to colorResolve(), you will break the ability to generate publication-quality images with GD::SVG. Perhaps Todd Harris will add colorResolve() to a future version of GD::SVG, in which case I will make the suggested change to Bio::Graphics. I would recommend instead making two Bio::Graphics::Panel objects, and generating a pair of GD objects (using the Panel->gd() method). Then you can combine them onto a third GD object in whatever geometry you want by using GD->copy() Lincoln On Wednesday 25 February 2004 06:46 pm, Jonathan Crabtree wrote: > Haibo- > > hz5@njit.edu wrote: > >Is there any way to render 2 Bio::Graphics::Panel into one png > > image? because I want 2 different arrows with different labeled > > coordinates on the same image and align to the left, but one > > Panel can only have one coordinates system. > > The answer is yes, with a couple of caveats. The first is that you > will have to take care of the layout of the individual > Panel-generated images. If you're left-justifying everything then > this should be easy enough. The second is that I would recommend > making a one-line change to Bio/Graphics/Panel.pm, to prevent the > package from trying to allocate the same set of colors twice (when > you reuse the same GD object to draw the two different parts of the > image.)
Search for the following piece of code in Panel.pm (at > line 411 in bioperl-1.4, I think): > > for my $name ('white','black',keys %COLORS) { > my $idx = $gd->colorAllocate(@{$COLORS{$name}}); > $translation_table{$name} = $idx; > } > > Change "colorAllocate" to "colorResolve"; this should have no > effect on any existing Bio::Graphics code (AFAIK) and will allow > you to do your two (or three or four)-Panel trick. (As an aside, > I'd like to lobby for this one-line change to be made in a future > version of > Bio::Graphics::Panel, for precisely this reason.) In any case, > once you've made that change and reinstalled your copy of Bioperl, > here is a rough outline of what you need to do: > > 1. Set up your individual Bio::Graphics::Panel objects (e.g. $p1, > $p2, $p3, etc.) as desired to draw your images, but do *not* call > the gd method on any of them yet. > > 2. Create a GD::Image object big enough to hold the images that > will be drawn by $p1, $p2, $p3, etc.: > my $gdImg = GD::Image->new($fullWidth, $fullHeight); > (Note: use $p1->width(), $p1->height(), etc., to determine what > $fullWidth and $fullHeight should be, based on your desired Panel > layout algorithm.) > > 4. Use a "dummy" Bio::Graphics::Panel object to allocate all your > colors (this is an optional step; I do this because my code does > some drawing that isn't handled by Bio::Graphics::Panel, and want > to make sure that the palette has been allocated before I start): > > my $dummyPanel = Bio::Graphics::Panel->new(-length => 100, > -offset => 0, -width => $fullWidth); > $dummyPanel->gd($gdImg); # forces color allocation > > 5. Draw the individual panels and generate your png image: > > $p1->gd($gdImg); > $p2->gd($gdImg); > my $pngData = $gdImg->png(); > > I've glossed over some of the details here, for example the fact > that you may need to know the value of $p1->height() before you can > initialize $p2, but that's the basic idea. 
I've been using this > method to generate some comparative sequence displays and while > it's definitely a bit of a hack, it works well in practice. You > can also do the same thing with a GD::SVG::Image if you'd like to > generate SVG output. Good luck, > > Jonathan > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l - -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFAP1Zt0CIvUP7P+AkRAreyAJ0XIcjMDeT/Bw69OBOEhD8tsznP+QCfVLWo +RnQaijXxPlVWTbmjTkbHYw= =lN1U -----END PGP SIGNATURE----- From lstein at cshl.edu Fri Feb 27 09:45:17 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Fri Feb 27 09:52:45 2004 Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows In-Reply-To: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu> References: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu> Message-ID: <200402271645.18011.lstein@cshl.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Chris, Do you want to try the bioperl-1.4 ppm located on this repository? http://www.gmod.org/ggb/ppm I put it together myself and it's the one that seems to work properly for me. Lincoln On Wednesday 25 February 2004 07:26 pm, Chris Fields wrote: > I was unable to get the PPM package for 1.4 working for Windows > from http:/bioperl.org/DIST and had to perform a workaround. I > decided to post it in case others were running into problems. > > When I first tried installing Bioperl using PPM, it installs > bioperl 1.2 first (!?!), then allows upgrading to 1.2.3. However, > it will not install 1.4 b/c of the additional dependencies > (HTML-Entities and IO-Scalar). The latter dependencies are notably > not req'd for 1.2 or 1.2.3. 
IMHO, I'm guessing that PPM can't find > these modules b/c it is looking for specific ppm packages named > HTML-Entities and IO-Scalar, not for the modules named > HTML-Entities and IO-Scalar (which are included in the packages > HTML-Parser and IO-stringy). This problem could be linked to the > version of PPM I'm using (3.1) on ActivePerl 5.8.3-809, both of > which are very new, so I have no idea if this is a problem with > older versions of PPM. > > The workaround was to remove the dependencies manually. I > downloaded the relevant ppm tar file and corresponding ppd files > (bioperl-1.4-ppm.tar.gz and Bioperl-1.4.ppd, respectively) to a > local directory (C:\Perl\Bioperl). Using a text editor, I removed > all references to the added dependencies and saved the file. More > specifically, I deleted the following lines, listed twice under > Implementations (so delete both sets!): > > > VERSION="0,0,0,0" /> > /> > > > I then entered PPM, set up a local ppd repository: > > rep add local_bio "C:/Perl/Bioperl" > > I then searched for and installed the modifed PPM file and it > worked. > > Like I said, I don't know if this is a PPM issue or not. However, > I think it might be a good idea to remove those dependencies just > in case, as they are a bit redundant (both HTML-Parser and > IO-stringy are already listed). > > My two cents... > __________________________________ > > > > Chris Fields - Postdoctoral Researcher > Lab of Dr. Robert Switzer > > Address: > > University of Illinois at Urbana-Champaign > Dept. of Biochemistry - 323 RAL > 600 S. Mathews Ave. > Urbana, IL 61801 > > Phone : (217) 333-7098 > Fax : (217) 244-5858 - -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFAP1f90CIvUP7P+AkRAkyOAJ9BmoqcV3DC4zJh392bIveOQ9ec6wCfVMbb EKv61liRTU8XfEeQ1yg6EeU= =IP7P -----END PGP SIGNATURE----- From Glez-Izarzugaza at lycos.es Thu Feb 26 22:55:21 2004 From: Glez-Izarzugaza at lycos.es (Jose Mª Glez Izarzugaza) Date: Fri Feb 27 09:59:39 2004 Subject: [Bioperl-l] Breadth-First Search Algorithm - BFS Message-ID: <009001c3fce5$8624e0e0$5be625d5@txema> Hello everyone, I'm working with a graph and I need to calculate the values of C and L. To do so, I need an algorithm to calculate the distance to the other elements. A good one is the BFS algorithm. I tried to write the script (the algorithm itself) in Perl but I got absolutely lost. Can anyone help me? Thanks in advance, Alquemius PS: Please, send me text-only mails From lstein at cshl.edu Fri Feb 27 09:52:37 2004 From: lstein at cshl.edu (Lincoln Stein) Date: Fri Feb 27 09:59:48 2004 Subject: [Bioperl-l] Mysql database connection from Gbrowse In-Reply-To: References: Message-ID: <200402271652.37417.lstein@cshl.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Sean, I can think of two explanations for this: 1) you have two versions of perl and the one that you call when you are on the command line is different from the one that the CGI script calls 2) you have two installations of the Perl library files, and the PERL5LIB environment variable is different under the CGI script than when you are logged in yourself. Does this ring any bells? Lincoln On Thursday 26 February 2004 11:01 pm, Sean Davis wrote: > Lincoln and others, > > With regard to my last e-mail, I simply used bp_load_gff.pl and was > able to load the human gff files from gmod.org within an hour or > so. > > In any case, I have a working mysql database called human that I > can access via mysql command line.
I also have a working version > of gbrowse that connects to the yeast_chr1 memory database. (I > also connected to the DAS server at ucsc--WAY COOL.) > > I am now trying to connect to my mysql human database and get the > following returned to the browser. > > http://localhost/cgi-bin/gbrowse/human > > ----------------- > An internal error has occurred > > Could not open database. > install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC > (@INC contains: > /System/Library/Perl/5.8.1/darwin-thread-multi-2level > /System/Library/Perl/5.8.1 > /Library/Perl/5.8.1/darwin-thread-multi-2level /Library/Perl/5.8.1 > /Library/Perl > /Network/Library/Perl/5.8.1/darwin-thread-multi-2level > /Network/Library/Perl/5.8.1 /Network/Library/Perl .) at (eval 18) > line 3. Perhaps the DBD::mysql perl module hasn't been fully > installed, or perhaps the capitalisation of 'mysql' isn't right. > Available drivers: ExampleP, Multiplex, Proxy, Sponge. > at /Library/Perl/5.8.1/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > line 139 > > > Please contact this site's maintainer ([no address given]) for > assistance. > > For the source code for this browser, see the Generic Model > Organism Database Project. For other questions, send mail to > lstein@cshl.org. $Id: yeast_chr1.conf,v 1.6 2004/02/04 15:03:43 > marclogghe Exp $ > > Note: This page uses cookie to save and restore preference > information. No information is shared. > Generic genome browser version > Bio::Graphics::Browser=HASH(0x801294) ------------------ > > However, this works: > perl -e "use Bio::DB::GFF::Adaptor::dbi::mysql" > As does > perl -e "use DBD::mysql" > > > From the human.conf file: > [GENERAL] > description = human > db_adaptor = Bio::DB::GFF > db_args = -adaptor dbi::mysql > -dsn human > > Any ideas? > > Thanks, > Sean > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l - -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFAP1m10CIvUP7P+AkRAhYgAJ98eOkEOIBuVbOsp+p0l64BJOQVIQCfX1g6 BKpWVoq+N2ltfmVYPMazrZY= =mYfN -----END PGP SIGNATURE----- From laurichj at cs.ucr.edu Thu Feb 26 12:45:35 2004 From: laurichj at cs.ucr.edu (Josh Lauricha) Date: Fri Feb 27 10:01:00 2004 Subject: [Bioperl-l] Re: Bio ::seqIO ::tigr In-Reply-To: References: Message-ID: <94AAA922-6883-11D8-ABEB-000A95BBDAD2@cs.ucr.edu> Does the source_term_id refer to the source_tag()? On Feb 26, 2004, at 9:08 AM, matthieu CONTE wrote: > OK, I managed to use load_seqdatabase! > But....there is another problem....... > There is a null field and I think Biosql doesn't accept this. > Table Seqfeature id : field 'source_term_id' > > Do you think it would be better to make modifications on the > tigrxml.dtd or on the load_seqdatabase script? > > > [conte@bearn biosql]$ perl load_seqdatabase.pl --dbuser biosql > --dbpass biosql --namespace orysa_tigr --format tigr > /home/conte/pipeline_orthologues/data/orysa_tigr/chr07.xml > Loading /home/conte/pipeline_orthologues/data/orysa_tigr/chr07.xml ...
> > -------------------- WARNING --------------------- > MSG: insert in Bio::DB::BioSQL::SeqFeatureAdaptor (driver) failed, > values were ("","1") FKs (26216,37,) > Column 'source_term_id' cannot be null > --------------------------------------------------- > Could not store 8355.t01530: > ------------- EXCEPTION ------------- > MSG: create: object (Bio::SeqFeature::Generic) failed to insert or to > be found by unique key > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:207 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:253 > STACK Bio::DB::Persistent::PersistentObject::store > /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/Persistent/ > PersistentObject.pm:270 > STACK Bio::DB::BioSQL::SeqAdaptor::store_children > /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/ > SeqAdaptor.pm:246 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:215 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:253 > STACK Bio::DB::Persistent::PersistentObject::store > /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/Persistent/ > PersistentObject.pm:270 > STACK (eval) load_seqdatabase.pl:517 > STACK toplevel load_seqdatabase.pl:500 > > > > > > > > > ----------------------------------------------------------- > Matthieu CONTE > M. Sc. 
in Bioinformatics from SIB > > CIRAD-Biotrop TA40/03 > Avenue Agropolis > 34398 Montpellier Cedex 5 > FRANCE > > m_conte@hotmail.com > tel: (33)04 67 61 60 21 > fax :(33) 4 67 61 56 05 > > ----------------------------------------------------------- > > > > > >> From: Josh Lauricha >> To: "matthieu CONTE" >> Subject: Re: [Bioperl-l] Re: Bio ::seqIO ::tigr Date: Wed, 25 Feb >> 2004 08:50:39 -0800 >> >> Thanks for pointing out the typos (the other one is my e-mail address >> ;). >> >> However, based on the size of the file your using (the error is at >> line 2892), I am willing to bet they are the .coordset files. These >> are not the Tigr XML format. Actually, they are not even valid XML... >> If this is the case (check by the extention or, if still in doubt, >> open them up. If there is a tag on the first line then its an >> error in my parser), Jason wrote a parser for this that I can send to >> you. >> >> On Feb 25, 2004, at 8:02 AM, matthieu CONTE wrote: >> >>> Hi, >>> I tried your version of tigr.pm Mr Lauricha. There is a typing >>> mistake line 820. >>> >>> unfortunately it still have another problem: >>> "MSG: [2892]Required missing" >>> >>> ----------------------------------------------------------- >>> Matthieu CONTE >>> M. Sc. in Bioinformatics from SIB >>> >>> CIRAD-Biotrop TA40/03 >>> Avenue Agropolis >>> 34398 Montpellier Cedex 5 >>> FRANCE >>> >>> m_conte@hotmail.com >>> tel: (33)04 67 61 60 21 >>> fax :(33) 4 67 61 56 05 >>> >>> ----------------------------------------------------------- >>> >>> _________________________________________________________________ >>> MSN Messenger : discutez en direct avec vos amis ! >>> http://www.msn.fr/msger/default.asp >>> >>> >> Josh Lauricha >> laurichj@bioinfo.ucr.edu >> OpenPGP: 5A0D 92D3 D093 79DE F724 1137 6DF1 B5EB D9CE AAA8 >> << PGP.sig >> > > _________________________________________________________________ > MSN Search, le moteur de recherche qui pense comme vous ! 
> http://search.msn.fr/worldwide.asp > > Josh Lauricha laurichj@bioinfo.ucr.edu OpenPGP: 5A0D 92D3 D093 79DE F724 1137 6DF1 B5EB D9CE AAA8 From todd.harris at cshl.edu Fri Feb 27 10:13:51 2004 From: todd.harris at cshl.edu (Todd Harris) Date: Fri Feb 27 10:20:09 2004 Subject: [Bioperl-l] bioperl graphics In-Reply-To: <200402271638.37134.lstein@cshl.edu> Message-ID: Hi Lincoln - I'll add this to GD::SVG next week and drop you a line when complete. I'm also planning to add code that will allow one to fall back onto GD if a method has not been mapped to the GD::SVG namespace - and if that still doesn't work within the SVG gestalt, to die with a modicum of grace. t > On 2/27/04 8:38 AM, Lincoln Stein wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Sorry, but if you change colorAllocate() to colorResolve(), you will > break the ability to generate publication-quality images with > GD::SVG. Perhaps Todd Harris will add colorResolve() to a future > version of GD::SVG, in which case I will make the suggested change to > Bio::Graphics. > > I would recommend instead making two Bio::Graphics::Panel objects, and > generating a pair of GD objects (using the Panel->gd() method). Then > you can combine them onto a third GD object in whatever geometry you > want by using GD->copy() > > Lincoln > > On Wednesday 25 February 2004 06:46 pm, Jonathan Crabtree wrote: >> Haibo- >> >> hz5@njit.edu wrote: >>> Is there any way to render 2 Bio::Graphics::Panel into one png >>> image?
because I want 2 different arrows with different labeled >>> coordinates on the same image and align to the left, but one >>> Panel can only have one coordinates system. >> >> The answer is yes, with a couple of caveats. The first is that you >> will have to take care of the layout of the individual >> Panel-generated images. If you're left-justifying everything then >> this should be easy enough. The second is that I would recommend >> making a one-line change to Bio/Graphics/Panel.pm, to prevent the >> package from trying to allocate the same set of colors twice (when >> you reuse the same GD object to draw the two different parts of the >> image.) Search for the following piece of code in Panel.pm (at >> line 411 in bioperl-1.4, I think): >> >> for my $name ('white','black',keys %COLORS) { >> my $idx = $gd->colorAllocate(@{$COLORS{$name}}); >> $translation_table{$name} = $idx; >> } >> >> Change "colorAllocate" to "colorResolve"; this should have no >> effect on any existing Bio::Graphics code (AFAIK) and will allow >> you to do your two (or three or four)-Panel trick. (As an aside, >> I'd like to lobby for this one-line change to be made in a future >> version of >> Bio::Graphics::Panel, for precisely this reason.) In any case, >> once you've made that change and reinstalled your copy of Bioperl, >> here is a rough outline of what you need to do: >> >> 1. Set up your individual Bio::Graphics::Panel objects (e.g. $p1, >> $p2, $p3, etc.) as desired to draw your images, but do *not* call >> the gd method on any of them yet. >> >> 2. Create a GD::Image object big enough to hold the images that >> will be drawn by $p1, $p2, $p3, etc.: >> my $gdImg = GD::Image->new($fullWidth, $fullHeight); >> (Note: use $p1->width(), $p1->height(), etc., to determine what >> $fullWidth and $fullHeight should be, based on your desired Panel >> layout algorithm.) >> >> 4. 
Use a "dummy" Bio::Graphics::Panel object to allocate all your >> colors (this is an optional step; I do this because my code does >> some drawing that isn't handled by Bio::Graphics::Panel, and want >> to make sure that the palette has been allocated before I start): >> >> my $dummyPanel = Bio::Graphics::Panel->new(-length => 100, >> -offset => 0, -width => $fullWidth); >> $dummyPanel->gd($gdImg); # forces color allocation >> >> 5. Draw the individual panels and generate your png image: >> >> $p1->gd($gdImg); >> $p2->gd($gdImg); >> my $pngData = $gdImg->png(); >> >> I've glossed over some of the details here, for example the fact >> that you may need to know the value of $p1->height() before you can >> initialize $p2, but that's the basic idea. I've been using this >> method to generate some comparative sequence displays and while >> it's definitely a bit of a hack, it works well in practice. You >> can also do the same thing with a GD::SVG::Image if you'd like to >> generate SVG output. Good luck, >> >> Jonathan >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l@portal.open-bio.org >> http://portal.open-bio.org/mailman/listinfo/bioperl-l > > - -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.1 (GNU/Linux) > > iD8DBQFAP1Zt0CIvUP7P+AkRAreyAJ0XIcjMDeT/Bw69OBOEhD8tsznP+QCfVLWo > +RnQaijXxPlVWTbmjTkbHYw= > =lN1U > -----END PGP SIGNATURE----- > From cjfields at uiuc.edu Fri Feb 27 10:47:18 2004 From: cjfields at uiuc.edu (Chris Fields) Date: Fri Feb 27 10:53:27 2004 Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows In-Reply-To: <200402271645.18011.lstein@cshl.edu> References: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu> <200402271645.18011.lstein@cshl.edu> Message-ID: <6.0.0.22.2.20040227094211.01bfc0c8@express.cites.uiuc.edu> I already have it installed and it seems to be fine.
I have just one question (and I don't want to start a flame war): what OS do you find that Bioperl works best for? I'm using a dual-boot system with Windows XP and Fedora Core 1, and I've had fewer problems with Fedora (esp. when using GBrowse), but I don't know if this is due to the configuration of Bioperl, GBrowse, or Perl on either OS. I'm considering going pure Linux within the year (although Mac OS X is looking very appealing). Chris At 08:45 AM 2/27/2004, Lincoln Stein wrote: >-----BEGIN PGP SIGNED MESSAGE----- >Hash: SHA1 > >Hi Chris, > >Do you want to try the bioperl-1.4 ppm located on this repository? > > http://www.gmod.org/ggb/ppm > >I put it together myself and it's the one that seems to work properly >for me. > >Lincoln > >On Wednesday 25 February 2004 07:26 pm, Chris Fields wrote: > > I was unable to get the PPM package for 1.4 working for Windows > > from http:/bioperl.org/DIST and had to perform a workaround. I > > decided to post it in case others were running into problems. > > > > When I first tried installing Bioperl using PPM, it installs > > bioperl 1.2 first (!?!), then allows upgrading to 1.2.3. However, > > it will not install 1.4 b/c of the additional dependencies > > (HTML-Entities and IO-Scalar). The latter dependencies are notably > > not req'd for 1.2 or 1.2.3. IMHO, I'm guessing that PPM can't find > > these modules b/c it is looking for specific ppm packages named > > HTML-Entities and IO-Scalar, not for the modules named > > HTML-Entities and IO-Scalar (which are included in the packages > > HTML-Parser and IO-stringy). This problem could be linked to the > > version of PPM I'm using (3.1) on ActivePerl 5.8.3-809, both of > > which are very new, so I have no idea if this is a problem with > > older versions of PPM. > > > > The workaround was to remove the dependencies manually.
I > > downloaded the relevant ppm tar file and corresponding ppd files > > (bioperl-1.4-ppm.tar.gz and Bioperl-1.4.ppd, respectively) to a > > local directory (C:\Perl\Bioperl). Using a text editor, I removed > > all references to the added dependencies and saved the file. More > > specifically, I deleted the following lines, listed twice under > > Implementations (so delete both sets!): > > > > > > > VERSION="0,0,0,0" /> > > > /> > > > > > > I then entered PPM, set up a local ppd repository: > > > > rep add local_bio "C:/Perl/Bioperl" > > > > I then searched for and installed the modifed PPM file and it > > worked. > > > > Like I said, I don't know if this is a PPM issue or not. However, > > I think it might be a good idea to remove those dependencies just > > in case, as they are a bit redundant (both HTML-Parser and > > IO-stringy are already listed). > > > > My two cents... > > __________________________________ > > > > > > > > Chris Fields - Postdoctoral Researcher > > Lab of Dr. Robert Switzer > > > > Address: > > > > University of Illinois at Urbana-Champaign > > Dept. of Biochemistry - 323 RAL > > 600 S. Mathews Ave. > > Urbana, IL 61801 > > > > Phone : (217) 333-7098 > > Fax : (217) 244-5858 > >- -- >Lincoln D. Stein >Cold Spring Harbor Laboratory >1 Bungtown Road >Cold Spring Harbor, NY 11724 >-----BEGIN PGP SIGNATURE----- >Version: GnuPG v1.2.1 (GNU/Linux) > >iD8DBQFAP1f90CIvUP7P+AkRAkyOAJ9BmoqcV3DC4zJh392bIveOQ9ec6wCfVMbb >EKv61liRTU8XfEeQ1yg6EeU= >=IP7P >-----END PGP SIGNATURE----- >_______________________________________________ >Bioperl-l mailing list >Bioperl-l@portal.open-bio.org >http://portal.open-bio.org/mailman/listinfo/bioperl-l __________________________________ Chris Fields - Postdoctoral Researcher Lab of Dr. Robert Switzer Address: University of Illinois at Urbana-Champaign Dept. of Biochemistry - 323 RAL 600 S. Mathews Ave. 
Urbana, IL 61801 Phone : (217) 333-7098 Fax : (217) 244-5858 From brian_osborne at cognia.com Fri Feb 27 10:57:14 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Fri Feb 27 11:03:26 2004 Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows In-Reply-To: <6.0.0.22.2.20040227094211.01bfc0c8@express.cites.uiuc.edu> Message-ID: Chris, I'd always intended to have a double-boot Windows/Linux machine but I thought I'd check out Cygwin, just for fun. I was so impressed with it running Bioperl that I decided not to install Linux. I must use Windows at work so pure Unix is not an option for me. Someone recently showed me Windows on a Linux machine running VMWare, that was impressive. Brian O. -----Original Message----- From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Chris Fields Sent: Friday, February 27, 2004 10:47 AM To: bioperl-l@bioperl.org; Lincoln Stein Cc: bioperl-l@bioperl.org Subject: Re: [Bioperl-l] bioperl-1.4 ppm package for Windows I already have it installed an it seems to be fine. I have just one question (and I don't want to start a flame war): what OS do you find that Bioperl works best for? I'm using a duel-boot system with Windows XP and Fedora Core 1, and I've had fewer problems with Fedora (esp. when using GBrowse), but I don't know if this is due to the configuration of Bioperl, GBrowse, or Perl on either OS. I'm considering going pure Linux within the year (although Mac OS X is looking very appealing). Chris At 08:45 AM 2/27/2004, Lincoln Stein wrote: >-----BEGIN PGP SIGNED MESSAGE----- >Hash: SHA1 > >Hi Chris, > >Do you want to try the bioperl-1.4 ppm located on this repository? > > http://www.gmod.org/ggb/ppm > >I put it together myself and it's the one that seems to work properly >for me. 
> >Lincoln > >On Wednesday 25 February 2004 07:26 pm, Chris Fields wrote: > > I was unable to get the PPM package for 1.4 working for Windows > > from http:/bioperl.org/DIST and had to perform a workaround. I > > decided to post it in case others were running into problems. > > > > When I first tried installing Bioperl using PPM, it installs > > bioperl 1.2 first (!?!), then allows upgrading to 1.2.3. However, > > it will not install 1.4 b/c of the additional dependencies > > (HTML-Entities and IO-Scalar). The latter dependencies are notably > > not req'd for 1.2 or 1.2.3. IMHO, I'm guessing that PPM can't find > > these modules b/c it is looking for specific ppm packages named > > HTML-Entities and IO-Scalar, not for the modules named > > HTML-Entities and IO-Scalar (which are included in the packages > > HTML-Parser and IO-stringy). This problem could be linked to the > > version of PPM I'm using (3.1) on ActivePerl 5.8.3-809, both of > > which are very new, so I have no idea if this is a problem with > > older versions of PPM. > > > > The workaround was to remove the dependencies manually. I > > downloaded the relevant ppm tar file and corresponding ppd files > > (bioperl-1.4-ppm.tar.gz and Bioperl-1.4.ppd, respectively) to a > > local directory (C:\Perl\Bioperl). Using a text editor, I removed > > all references to the added dependencies and saved the file. More > > specifically, I deleted the following lines, listed twice under > > Implementations (so delete both sets!): > > > > > > > VERSION="0,0,0,0" /> > > > /> > > > > > > I then entered PPM, set up a local ppd repository: > > > > rep add local_bio "C:/Perl/Bioperl" > > > > I then searched for and installed the modifed PPM file and it > > worked. > > > > Like I said, I don't know if this is a PPM issue or not. However, > > I think it might be a good idea to remove those dependencies just > > in case, as they are a bit redundant (both HTML-Parser and > > IO-stringy are already listed). 
> > > > My two cents... > > __________________________________ > > > > > > > > Chris Fields - Postdoctoral Researcher > > Lab of Dr. Robert Switzer > > > > Address: > > > > University of Illinois at Urbana-Champaign > > Dept. of Biochemistry - 323 RAL > > 600 S. Mathews Ave. > > Urbana, IL 61801 > > > > Phone : (217) 333-7098 > > Fax : (217) 244-5858 > >- -- >Lincoln D. Stein >Cold Spring Harbor Laboratory >1 Bungtown Road >Cold Spring Harbor, NY 11724 >-----BEGIN PGP SIGNATURE----- >Version: GnuPG v1.2.1 (GNU/Linux) > >iD8DBQFAP1f90CIvUP7P+AkRAkyOAJ9BmoqcV3DC4zJh392bIveOQ9ec6wCfVMbb >EKv61liRTU8XfEeQ1yg6EeU= >=IP7P >-----END PGP SIGNATURE----- >_______________________________________________ >Bioperl-l mailing list >Bioperl-l@portal.open-bio.org >http://portal.open-bio.org/mailman/listinfo/bioperl-l __________________________________ Chris Fields - Postdoctoral Researcher Lab of Dr. Robert Switzer Address: University of Illinois at Urbana-Champaign Dept. of Biochemistry - 323 RAL 600 S. Mathews Ave. Urbana, IL 61801 Phone : (217) 333-7098 Fax : (217) 244-5858 _______________________________________________ Bioperl-l mailing list Bioperl-l@portal.open-bio.org http://portal.open-bio.org/mailman/listinfo/bioperl-l From sdavis2 at mail.nih.gov Fri Feb 27 11:16:03 2004 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri Feb 27 11:22:02 2004 Subject: [Bioperl-l] Making gff files for ucsc or ncbi build Message-ID: Does anyone know where the build files for build 34_2 are kept these days? They used to be called "by chromosome" files, or some such thing. I am looking to generate gff files and corresponding fasta files for use with Gbrowse. If I make them, I would love to make them available to the community (unless they are out there somewhere already). 
Sean From cjfields at uiuc.edu Fri Feb 27 12:20:52 2004 From: cjfields at uiuc.edu (Chris Fields) Date: Fri Feb 27 12:26:59 2004 Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows In-Reply-To: References: <6.0.0.22.2.20040227094211.01bfc0c8@express.cites.uiuc.edu> Message-ID: <6.0.0.22.2.20040227110605.01c86fa8@express.cites.uiuc.edu> At 09:57 AM 2/27/2004, you wrote: >Chris, > >I'd always intended to have a double-boot Windows/Linux machine but I >thought I'd check out Cygwin, just for fun. I was so impressed with it >running Bioperl that I decided not to install Linux. I originally installed Linux as a means to an end (using software that isn't Windows-friendly, like mfold). It's hard to let go of the luxuries of Windows, though, especially when you have programs like Office, SigmaPlot, and others which make benchwork research so much easier (and require little to no programming experience). I really like Linux from a number of aspects (open-source, development, etc). However, I think that Apple has really hit upon something with OS X. It is a nice combination of open- and closed-source (I don't mind paying for software, as long as it's reasonable) and isn't unreasonably priced. I get the best of both worlds (closed source software like Office, Endnotes, etc. with open-source software like MySQL and Apache, with a UNIX-based OS, and nice development tools). Apple also is really pushing the bioinformatics angle. The constant updates for both OS X and Linux make both much more appealing to me. That's it, my next system is a G5!!!! Now I'll just have to sell the car... >I must use Windows at work so pure Unix is not an option for me. Someone >recently showed me Windows on a Linux machine running VMWare, that was >impressive. I have managed to get a few things running under Wine (Windows emulation in Linux). It works for certain things, but I haven't tried it out too much b/c I have a dual-boot system.
I just get tired of a number of Windows issues (I use Sun's Java VM and it crawls on Windows XP but flies on Linux). Chris >Brian O. > > > >-----Original Message----- >From: bioperl-l-bounces@portal.open-bio.org >[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Chris Fields >Sent: Friday, February 27, 2004 10:47 AM >To: bioperl-l@bioperl.org; Lincoln Stein >Cc: bioperl-l@bioperl.org >Subject: Re: [Bioperl-l] bioperl-1.4 ppm package for Windows > >I already have it installed an it seems to be fine. > >I have just one question (and I don't want to start a flame war): what OS >do you find that Bioperl works best for? I'm using a duel-boot system with >Windows XP and Fedora Core 1, and I've had fewer problems with Fedora (esp. >when using GBrowse), but I don't know if this is due to the configuration >of Bioperl, GBrowse, or Perl on either OS. I'm considering going pure >Linux within the year (although Mac OS X is looking very appealing). > >Chris > >At 08:45 AM 2/27/2004, Lincoln Stein wrote: > >-----BEGIN PGP SIGNED MESSAGE----- > >Hash: SHA1 > > > >Hi Chris, > > > >Do you want to try the bioperl-1.4 ppm located on this repository? > > > > http://www.gmod.org/ggb/ppm > > > >I put it together myself and it's the one that seems to work properly > >for me. > > > >Lincoln > > > >On Wednesday 25 February 2004 07:26 pm, Chris Fields wrote: > > > I was unable to get the PPM package for 1.4 working for Windows > > > from http:/bioperl.org/DIST and had to perform a workaround. I > > > decided to post it in case others were running into problems. > > > > > > When I first tried installing Bioperl using PPM, it installs > > > bioperl 1.2 first (!?!), then allows upgrading to 1.2.3. However, > > > it will not install 1.4 b/c of the additional dependencies > > > (HTML-Entities and IO-Scalar). The latter dependencies are notably > > > not req'd for 1.2 or 1.2.3.
IMHO, I'm guessing that PPM can't find > > > these modules b/c it is looking for specific ppm packages named > > > HTML-Entities and IO-Scalar, not for the modules named > > > HTML-Entities and IO-Scalar (which are included in the packages > > > HTML-Parser and IO-stringy). This problem could be linked to the > > > version of PPM I'm using (3.1) on ActivePerl 5.8.3-809, both of > > > which are very new, so I have no idea if this is a problem with > > > older versions of PPM. > > > > > > The workaround was to remove the dependencies manually. I > > > downloaded the relevant ppm tar file and corresponding ppd files > > > (bioperl-1.4-ppm.tar.gz and Bioperl-1.4.ppd, respectively) to a > > > local directory (C:\Perl\Bioperl). Using a text editor, I removed > > > all references to the added dependencies and saved the file. More > > > specifically, I deleted the following lines, listed twice under > > > Implementations (so delete both sets!): > > > > > > > > > > > VERSION="0,0,0,0" /> > > > > > /> > > > > > > > > > I then entered PPM, set up a local ppd repository: > > > > > > rep add local_bio "C:/Perl/Bioperl" > > > > > > I then searched for and installed the modifed PPM file and it > > > worked. > > > > > > Like I said, I don't know if this is a PPM issue or not. However, > > > I think it might be a good idea to remove those dependencies just > > > in case, as they are a bit redundant (both HTML-Parser and > > > IO-stringy are already listed). > > > > > > My two cents... > > > __________________________________ > > > > > > > > > > > > Chris Fields - Postdoctoral Researcher > > > Lab of Dr. Robert Switzer > > > > > > Address: > > > > > > University of Illinois at Urbana-Champaign > > > Dept. of Biochemistry - 323 RAL > > > 600 S. Mathews Ave. > > > Urbana, IL 61801 > > > > > > Phone : (217) 333-7098 > > > Fax : (217) 244-5858 > > > >- -- > >Lincoln D. 
Stein > >Cold Spring Harbor Laboratory > >1 Bungtown Road > >Cold Spring Harbor, NY 11724 > >-----BEGIN PGP SIGNATURE----- > >Version: GnuPG v1.2.1 (GNU/Linux) > > > >iD8DBQFAP1f90CIvUP7P+AkRAkyOAJ9BmoqcV3DC4zJh392bIveOQ9ec6wCfVMbb > >EKv61liRTU8XfEeQ1yg6EeU= > >=IP7P > >-----END PGP SIGNATURE----- > >_______________________________________________ > >Bioperl-l mailing list > >Bioperl-l@portal.open-bio.org > >http://portal.open-bio.org/mailman/listinfo/bioperl-l > >__________________________________ > >Chris Fields - Postdoctoral Researcher >Lab of Dr. Robert Switzer > >Address: > >University of Illinois at Urbana-Champaign >Dept. of Biochemistry - 323 RAL >600 S. Mathews Ave. >Urbana, IL 61801 > >Phone : (217) 333-7098 >Fax : (217) 244-5858 > >_______________________________________________ >Bioperl-l mailing list >Bioperl-l@portal.open-bio.org >http://portal.open-bio.org/mailman/listinfo/bioperl-l __________________________________ Chris Fields - Postdoctoral Researcher Lab of Dr. Robert Switzer Address: University of Illinois at Urbana-Champaign Dept. of Biochemistry - 323 RAL 600 S. Mathews Ave. Urbana, IL 61801 Phone : (217) 333-7098 Fax : (217) 244-5858 From jrs at denny.farviolet.com Fri Feb 27 15:27:10 2004 From: jrs at denny.farviolet.com (Jeremy Semeiks) Date: Fri Feb 27 15:33:01 2004 Subject: [Bioperl-l] OT: Breadth-First Search Algorithm - BFS In-Reply-To: <009001c3fce5$8624e0e0$5be625d5@txema> References: <009001c3fce5$8624e0e0$5be625d5@txema> Message-ID: <20040227202710.GH595@64.81.242.180> On Fri, Feb 27, 2004 at 04:55:21AM +0100, Jose M? Glez Izarzugaza wrote: > Hello everyone, > > I'm working with a graph and I need to calculate the values of C and L, to do so, I need an algorithm to calculate the distance to the other elements. > > A good one is BFS algorithm. > > I tried to write the script (the algorithm itself) in Perl but I got absolutely lost. > > Can anyone help me? 
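For reference, the BFS asked about above can be sketched in plain Perl along these lines. This is only a minimal sketch under stated assumptions: the graph is unweighted and stored as a hash of adjacency lists, and bfs_distances and the toy %graph are hypothetical names made up for illustration (they are not part of BioPerl or any CPAN module). It assumes "distance" means the number of edges on a shortest path.

```perl
use strict;
use warnings;

# Hypothetical helper (not from BioPerl or CPAN): breadth-first search over
# an unweighted graph stored as a hash of adjacency lists. Returns a hash
# ref mapping every node reachable from $start to its distance in edges.
sub bfs_distances {
    my ($graph, $start) = @_;
    my %dist  = ($start => 0);   # distance of each visited node from $start
    my @queue = ($start);        # FIFO queue of nodes still to expand
    while (@queue) {
        my $node = shift @queue;
        for my $next (@{ $graph->{$node} || [] }) {
            next if exists $dist{$next};      # already reached by a shorter path
            $dist{$next} = $dist{$node} + 1;
            push @queue, $next;
        }
    }
    return \%dist;
}

# Toy undirected graph (assumed data); each edge is listed in both directions.
my %graph = (
    a => [qw(b c)],
    b => [qw(a d)],
    c => [qw(a d)],
    d => [qw(b c e)],
    e => [qw(d)],
);

my $dist = bfs_distances(\%graph, 'a');
print "$_ => $dist->{$_}\n" for sort keys %$dist;
```

Running bfs_distances once from every node and averaging the resulting distances would give a mean shortest-path length, if that is what L stands for here; the original question does not define C and L, so that reading is an assumption.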
Hi Jose, One solution is to use the Graph::Base module on CPAN: http://search.cpan.org/~jhi/Graph-0.20101/lib/Graph/Base.pm This module includes Dijkstra's shortest path algorithm. Dijkstra's is a few times slower than BFS, but you shouldn't see a difference for small graphs. (And in my experience, if you're trying to run search algorithms on large graphs, Perl is too slow anyway -- consider something like the C++ Boost Graph library instead.) HTH, Jeremy From jason at cgt.duhs.duke.edu Fri Feb 27 15:53:16 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Fri Feb 27 15:59:16 2004 Subject: [Bioperl-l] Re: [Gmod-schema] install problems, continued (fwd) Message-ID: -- Jason Stajich Duke University jason at cgt.mc.duke.edu ---------- Forwarded message ---------- Date: Fri, 27 Feb 2004 14:30:57 -0500 (EST) From: Don Gilbert To: cain@cshl.org, p.lijnzaad@med.uu.nl Cc: gmod-schema@lists.sourceforge.net Subject: Re: [Gmod-schema] install problems, continued THere is a problem with bioperl-1.4 related to ontology parsing. I'm assuming people in bioperl/ontology world know about this, but here is what I found -- Bio/Ontology/RelationshipType.pm has been reverted to remove some of Allen Day's necessary patches and now it won't parse SO and some of the other standard bio ontology data. 
- Don ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Found unknown type of relationship: [derived_from] Known types are: [IS_A], [PART_OF], [CONTAINS], [FOUND_IN] STACK: Error::throw STACK: Bio::Root::Root::throw /bio/biodb/common/perl/lib/Bio/Root/Root.pm:328 STACK: Bio::Ontology::RelationshipType::get_instance /bio/biodb/common/perl/lib/Bio/Ontology/RelationshipType.pm:143 STACK: Bio::Ontology::SimpleGOEngine::add_relationship_type /bio/biodb/common/perl/lib/Bio/Ontology/SimpleGOEngine.pm:284 STACK: Bio::OntologyIO::dagflat::_parse_flat_file /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:556 STACK: Bio::OntologyIO::dagflat::parse /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:250 STACK: Bio::OntologyIO::dagflat::next_ontology /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:284 STACK: /bio/biodb/gmod/bin/gmod_load_ontology.pl:119 ----------------------------------------------------------- Problem loading ontology /bio/biodb/gmod/data/ontologies/song/so.ontology: 65280 at bin/ -- reinstated below comment out, leading to lots of errors -- UC versus lc terms, and derived_from, other relations not in perl module. cricket.% diff [bioperl-eariler]/Bio/Ontology/RelationshipType.pm [bioperl-1.4]/Bio/Ontology/RelationshipType.pm 1c1 < # $Id: RelationshipType.pm,v 1.11 2003/06/20 18:31:44 allenday Exp $ --- > # $Id: RelationshipType.pm,v 1.5.2.5 2003/09/08 12:16:19 heikki Exp $ 136,145c136,141 < < # < #see the cell ontology. this code is too strict, even for dag-edit files. -allen < # < # if ( ! (($name eq IS_A) || ($name eq PART_OF) || < # ($name eq CONTAINS) || ( $name eq FOUND_IN ))) { < # my $msg = "Found unknown type of relationship: [" . $name . "]\n"; < # $msg .= "Known types are: [" . IS_A . "], [" . PART_OF . "], [" . CONTAINS . "], [" . FOUND_IN . "]"; < # $class->throw( $msg ); < # } --- > if ( ! 
(($name eq IS_A) || ($name eq PART_OF) || > ($name eq CONTAINS) || ( $name eq FOUND_IN ))) { > my $msg = "Found unknown type of relationship: [" . $name . "]\n"; > $msg .= "Known types are: [" . IS_A . "], [" . PART_OF . "], [" . CONTAINS . "], [" . FOUND_IN . "]"; > $class->throw( $msg ); > } 364a361,363 > ------------------------------------------------------- SF.Net is sponsored by: Speed Start Your Linux Apps Now. Build and deploy apps & Web services for Linux with a free DVD software kit from IBM. Click Now! http://ads.osdn.com/?ad_id=1356&alloc_id=3438&op=click _______________________________________________ Gmod-schema mailing list Gmod-schema@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/gmod-schema From brian_osborne at cognia.com Fri Feb 27 12:47:42 2004 From: brian_osborne at cognia.com (Brian Osborne) Date: Fri Feb 27 16:00:04 2004 Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows In-Reply-To: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu> Message-ID: Chris, Nigam Shah has fixed package.lst and Bioperl-1.4.ppd in DIST. Thank you for telling us about this problem. Brian O. -----Original Message----- From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Chris Fields Sent: Wednesday, February 25, 2004 12:27 PM To: bioperl-l@bioperl.org Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows I was unable to get the PPM package for 1.4 working for Windows from http:/bioperl.org/DIST and had to perform a workaround. I decided to post it in case others were running into problems. When I first tried installing Bioperl using PPM, it installs bioperl 1.2 first (!?!), then allows upgrading to 1.2.3. However, it will not install 1.4 b/c of the additional dependencies (HTML-Entities and IO-Scalar). The latter dependencies are notably not req'd for 1.2 or 1.2.3. 
IMHO, I'm guessing that PPM can't find these modules b/c it is looking for specific ppm packages named HTML-Entities and IO-Scalar, not for the modules named HTML-Entities and IO-Scalar (which are included in the packages HTML-Parser and IO-stringy). This problem could be linked to the version of PPM I'm using (3.1) on ActivePerl 5.8.3-809, both of which are very new, so I have no idea if this is a problem with older versions of PPM. The workaround was to remove the dependencies manually. I downloaded the relevant ppm tar file and corresponding ppd files (bioperl-1.4-ppm.tar.gz and Bioperl-1.4.ppd, respectively) to a local directory (C:\Perl\Bioperl). Using a text editor, I removed all references to the added dependencies and saved the file. More specifically, I deleted the following lines, listed twice under Implementations (so delete both sets!): I then entered PPM, set up a local ppd repository: rep add local_bio "C:/Perl/Bioperl" I then searched for and installed the modifed PPM file and it worked. Like I said, I don't know if this is a PPM issue or not. However, I think it might be a good idea to remove those dependencies just in case, as they are a bit redundant (both HTML-Parser and IO-stringy are already listed). My two cents... __________________________________ Chris Fields - Postdoctoral Researcher Lab of Dr. Robert Switzer Address: University of Illinois at Urbana-Champaign Dept. of Biochemistry - 323 RAL 600 S. Mathews Ave. Urbana, IL 61801 Phone : (217) 333-7098 Fax : (217) 244-5858 From cjfields at uiuc.edu Fri Feb 27 12:05:51 2004 From: cjfields at uiuc.edu (Chris Fields) Date: Fri Feb 27 16:00:12 2004 Subject: [Bioperl-l] Re: Windows PPM for Bioperl 1.4 In-Reply-To: <000a01c3fd43$675de9f0$34167680@Vivek> References: <000a01c3fd43$675de9f0$34167680@Vivek> Message-ID: <6.0.0.22.2.20040227094821.01bfb6b8@express.cites.uiuc.edu> An HTML attachment was scrubbed... 
URL: http://portal.open-bio.org/pipermail/bioperl-l/attachments/20040227/b428f970/attachment-0001.htm From jason at cgt.duhs.duke.edu Fri Feb 27 15:58:16 2004 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Fri Feb 27 16:04:15 2004 Subject: [Bioperl-l] Re: [Gmod-schema] install problems, continued (fwd) In-Reply-To: References: Message-ID: ignore this sorry -- wasn't thinking. as hilmar replied on gmod-schema list this was improper comparison with bioperl 1.2.x branch not bioperl 1.4. --jason On Fri, 27 Feb 2004, Jason Stajich wrote: > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > > ---------- Forwarded message ---------- > Date: Fri, 27 Feb 2004 14:30:57 -0500 (EST) > From: Don Gilbert > To: cain@cshl.org, p.lijnzaad@med.uu.nl > Cc: gmod-schema@lists.sourceforge.net > Subject: Re: [Gmod-schema] install problems, continued > > > There is a problem with bioperl-1.4 related to ontology parsing. > I'm assuming people in bioperl/ontology world know about this, but > here is what I found -- Bio/Ontology/RelationshipType.pm > has been reverted to remove some of Allen Day's necessary patches > and now it won't parse SO and some of the other standard bio ontology data. 
> > - Don > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Found unknown type of relationship: [derived_from] > Known types are: [IS_A], [PART_OF], [CONTAINS], [FOUND_IN] > STACK: Error::throw > STACK: Bio::Root::Root::throw /bio/biodb/common/perl/lib/Bio/Root/Root.pm:328 > STACK: Bio::Ontology::RelationshipType::get_instance /bio/biodb/common/perl/lib/Bio/Ontology/RelationshipType.pm:143 > STACK: Bio::Ontology::SimpleGOEngine::add_relationship_type /bio/biodb/common/perl/lib/Bio/Ontology/SimpleGOEngine.pm:284 > STACK: Bio::OntologyIO::dagflat::_parse_flat_file /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:556 > STACK: Bio::OntologyIO::dagflat::parse /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:250 > STACK: Bio::OntologyIO::dagflat::next_ontology /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:284 > STACK: /bio/biodb/gmod/bin/gmod_load_ontology.pl:119 > ----------------------------------------------------------- > Problem loading ontology /bio/biodb/gmod/data/ontologies/song/so.ontology: 65280 at bin/ > > -- reinstated below comment out, leading to lots of errors > -- UC versus lc terms, and derived_from, other relations not in perl module. > > > cricket.% diff > [bioperl-earlier]/Bio/Ontology/RelationshipType.pm > [bioperl-1.4]/Bio/Ontology/RelationshipType.pm > 1c1 > < # $Id: RelationshipType.pm,v 1.11 2003/06/20 18:31:44 allenday Exp $ > --- > > # $Id: RelationshipType.pm,v 1.5.2.5 2003/09/08 12:16:19 heikki Exp $ > 136,145c136,141 > < > < # > < #see the cell ontology. this code is too strict, even for dag-edit files. -allen > < # > < # if ( ! (($name eq IS_A) || ($name eq PART_OF) || > < # ($name eq CONTAINS) || ( $name eq FOUND_IN ))) { > < # my $msg = "Found unknown type of relationship: [" . $name . "]\n"; > < # $msg .= "Known types are: [" . IS_A . "], [" . PART_OF . "], [" . CONTAINS . "], [" . FOUND_IN . "]"; > < # $class->throw( $msg ); > < # } > --- > > if ( ! 
(($name eq IS_A) || ($name eq PART_OF) || > > ($name eq CONTAINS) || ( $name eq FOUND_IN ))) { > > my $msg = "Found unknown type of relationship: [" . $name . "]\n"; > > $msg .= "Known types are: [" . IS_A . "], [" . PART_OF . "], [" . CONTAINS . "], [" . FOUND_IN . "]"; > > $class->throw( $msg ); > > } > 364a361,363 > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From natg at shore.net Fri Feb 27 16:31:05 2004 From: natg at shore.net (Nathan (Nat) Goodman) Date: Fri Feb 27 16:37:03 2004 Subject: [Bioperl-l] OT: Breadth-First Search Algorithm - BFS Message-ID: <001101c3fd79$01acae30$de02000a@systemsbiology.net> Graph::Base is seriously broken. I urge anyone who's using it to check the bug list at http://rt.cpan.org/NoAuth/Bugs.html?Dist=Graph. The developer, Jarkko Hietaniemi, is well aware of the problems and is working on a rewrite. In the meantime, I have a very simple graph package that I'm happy to share. It handles undirected, unlabelled graphs only. It provides depth and breadth first search, all pairs shortest path, enumeration of all paths in a graph, as well as the basics. Best, Nat ---------- Jose M? Glez Izarzugaza wrote: >> I'm working with a graph and I need to calculate the values of C and L, to do so, I need an algorithm to calculate the distance to the other elements.... >> A good one is BFS algorithm. 
Jeremy replied: > One solution is to use the Graph::Base module on CPAN: > > http://search.cpan.org/~jhi/Graph-0.20101/lib/Graph/Base.pm From Steven.Roels at mpi.com Fri Feb 27 16:59:01 2004 From: Steven.Roels at mpi.com (Roels, Steven) Date: Fri Feb 27 17:05:54 2004 Subject: [Bioperl-l] OT: Breadth-First Search Algorithm - BFS Message-ID: Nat, Thanks for the heads-up. Anyone know off-hand how, if at all, these bugs impact Bio::Ontology::SimpleGOEngine? Thanks, -Steve ***************************************************************** Steve Roels, Ph.D. Senior Scientist I - Computational Biology Phone: (617) 761-6820 Millennium Pharmaceuticals, Inc. FAX: (617) 577-3555 640 Memorial Drive Email: roels@mpi.com Cambridge, MA 02139-4853 ***************************************************************** From hlapp at gnf.org Fri Feb 27 19:45:13 2004 From: hlapp at gnf.org (Hilmar Lapp) Date: Fri Feb 27 19:51:13 2004 Subject: [Bioperl-l] Re: Bio ::seqIO ::tigr In-Reply-To: <94AAA922-6883-11D8-ABEB-000A95BBDAD2@cs.ucr.edu> Message-ID: <5E8C17E6-6987-11D8-B6DE-000A959EB4C4@gnf.org> On Thursday, February 26, 2004, at 09:45 AM, Josh Lauricha wrote: > Does the source_term_id refer to the source_tag()? Yes. 
> > On Feb 26, 2004, at 9:08 AM, matthieu CONTE wrote: > >> [conte@bearn biosql]$ perl load_seqdatabase.pl --dbuser biosql >> --dbpass biosql --namespace orysa_tigr --format tigr >> /home/conte/pipeline_orthologues/data/orysa_tigr/chr07.xml >> Loading /home/conte/pipeline_orthologues/data/orysa_tigr/chr07.xml ... >> >> -------------------- WARNING --------------------- >> MSG: insert in Bio::DB::BioSQL::SeqFeatureAdaptor (driver) failed, >> values were ("","1") FKs (26216,37,) >> Column 'source_term_id' cannot be null >> --------------------------------------------------- >> What this means is that there was no $feat->source_tag set, and looking up undef resulted in undef for the foreign key :-) While bioperl doesn't enforce it, for biosql the source_tag() as well as the primary_tag() of a feature are mandatory and cannot be undef. Most SeqIO parsers set source_tag() to a static default if there is no value, e.g. 'EMBL/GenBank/SwissProt' if you're using FTHelper.pm. -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From hlapp at gnf.org Fri Feb 27 19:59:59 2004 From: hlapp at gnf.org (Hilmar Lapp) Date: Fri Feb 27 20:05:59 2004 Subject: [Bioperl-l] OT: Breadth-First Search Algorithm - BFS In-Reply-To: Message-ID: <6EA1F8A2-6989-11D8-B6DE-000A959EB4C4@gnf.org> Well, that depends on what you want to do with the ontology (or its engine, respectively). If all that you want is the methods defined in Bio::Ontology::OntologyI, then there is no effect. If you want to obtain $ontology->engine->graph() and then run those algorithms that are buggy, well then the bugs will have an effect obviously ... Does this answer your question? -hilmar On Friday, February 27, 2004, at 01:59 PM, Roels, Steven wrote: > > Nat, > > Thanks for the heads-up. > > Anyone know off-hand how, if at all, these bugs impact > Bio::Ontology::SimpleGOEngine? > > Thanks, > > -Steve > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 -------------------------------------------------------------
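For readers following the BFS thread: below is a minimal sketch, in plain Perl with no Graph::Base dependency, of breadth-first search computing the distance from one node to every other reachable node of an undirected, unlabelled graph. It is not Nat's package and not a Bioperl API. The adjacency-list layout, the name bfs_distances and the example graph are all assumptions made purely for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Breadth-first search over a graph stored as an adjacency list
# (hash of array refs). Hypothetical helper, not part of any module.
# Returns a hash ref mapping each reachable node to its distance
# (number of edges) from $start.
sub bfs_distances {
    my ($graph, $start) = @_;
    my %dist  = ($start => 0);
    my @queue = ($start);
    while (@queue) {
        my $node = shift @queue;                 # dequeue from the front
        for my $next (@{ $graph->{$node} || [] }) {
            next if exists $dist{$next};         # already visited
            $dist{$next} = $dist{$node} + 1;
            push @queue, $next;                  # enqueue at the back
        }
    }
    return \%dist;
}

# Hypothetical example graph with edges a-b, a-c, b-d, c-d, d-e.
# Each undirected edge is listed in both directions.
my %graph = (
    a => [qw(b c)],
    b => [qw(a d)],
    c => [qw(a d)],
    d => [qw(b c e)],
    e => [qw(d)],
);

my $dist = bfs_distances(\%graph, 'a');
print "$_: $dist->{$_}\n" for sort keys %$dist;
```

Calling bfs_distances once per start node gives all pairwise distances in an unweighted graph, which may be enough for the kind of distance calculation asked about at the start of the thread.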